0

I have a pandas DataFrame with 610 rows, and every row contains a nested list of coordinate pairs.

[1377778.4800000004, 6682395.377599999] is one coordinate pair.

I want to unnest every row so that, instead of one row containing a list of coordinates, I get one row for every coordinate pair.

I've tried s.apply(pd.Series).stack(), taken from the question "Split nested array values from Pandas Dataframe cell over multiple rows", but unfortunately that didn't work.

Any ideas? Many thanks in advance!

1
  • Could you add a copy of your data that can be copy pasted? Commented Oct 4, 2019 at 11:29

3 Answers

1

Here is my updated answer to your problem. I used reduce to flatten the nested lists and then itertools.chain to turn everything into a 1D list. After that I reshaped the list into a 2D array, which can be converted to the DataFrame you need. I tried to keep it as generic as possible. Please let me know if there are any problems.

# libraries
import operator
from functools import reduce
from itertools import chain

import numpy as np
import pandas as pd

# geometry_list is the nested list of coordinates from the question.
# Flatten the lists of lists using reduce, then remove one more level of
# nesting using itertools.chain.
reduced_coordinates = list(chain.from_iterable(reduce(operator.concat,
                                                      geometry_list)))

# reshape the flattened coordinates to 2D (one pair per row) and convert to a DataFrame
df = pd.DataFrame(np.reshape(reduced_coordinates, (-1, 2)))
df.columns = ['X', 'Y']
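
If geometry_list holds NumPy arrays of different lengths rather than plain Python lists, a variant like the following might be easier to work with. This is only a minimal sketch: the sample values are made up, and it assumes each array has a regular shape such as (1, n, 2). It swaps reduce/chain for np.concatenate, flattening each array to (n, 2) on its own before stacking the rows.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for geometry_list: arrays of shape (1, n, 2) with different n
geometry_list = [
    np.array([[[1377778.48, 6682395.3776],
               [1377770.949, 6682351.3558]]]),
    np.array([[[1377909.0626, 6696341.0589],
               [1377913.8088, 6696326.7092],
               [1377920.0, 6696300.0]]]),
]

# Reshape every array to (n, 2) separately, then stack all rows.
# Reshaping first avoids combining arrays whose lengths differ.
pairs = np.concatenate([np.asarray(g).reshape(-1, 2) for g in geometry_list])

df = pd.DataFrame(pairs, columns=['X', 'Y'])
print(df.shape)
```

Because each array is reshaped before anything is combined, the number of pairs per array does not have to match.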

6 Comments

Thank you very much! That did work! I am really thankful:)
Actually, it worked for only one of the 610 arrays (one array looks like [[[1377909.0626 6696341.0589], [1377913.8088 6696326.7092], ... ]]]). But for my geometry_list, which is already a list of 610 such arrays with the form [array1([[[...]]]), array2([[[...]]]), ... array610([[[...]]])], it still doesn't work...
Another problem is that these arrays have different lengths, so I get ValueError: operands could not be broadcast together with shapes (1,115,2) (1,134,2). array1 consists of 115 coordinate pairs and array2 has 134 pairs.
Well, I modified your advice a little bit, and now it works)))) Thank you very much :DDD
Hey, glad that it works for you now :)) Feel free to give my answer an upvote if you liked it.
0

One thing you can do is use NumPy. It lets you perform many list/array operations quickly and efficiently, including "unnesting" (reshaping) lists. Then you only have to convert the result to a pandas DataFrame.

For example,

import numpy as np
import pandas as pd

# your list
coordinate_list = [[[1377778.4800000004, 6682395.377599999],[6582395.377599999, 2577778.4800000004], [6582395.377599999, 2577778.4800000004]]]

# convert list to array
coordinate_array = np.array(coordinate_list)
# print shape of array
print(coordinate_array.shape)  # (1, 3, 2)

# reshape array into rows of (x, y) pairs
reshaped_array = np.reshape(coordinate_array, (3, 2))

df = pd.DataFrame(reshaped_array)
df.columns = ['X', 'Y']

The result is a DataFrame with one row per coordinate pair and the columns X and Y. Let me know if there is something I am missing.

1 Comment

I'm trying to get these coordinates from ArcGIS. It looks like this:

In:
geometry_list = []
for rslt in out['features']:
    gm = np.array(rslt['geometry']['rings'])
    geometry_list.append(gm)

Out: [array([[[1377778.48 , 6682395.3776], [1377770.949 , 6682351.3558],...

It's a list of 610 arrays, each containing a different number of coordinate pairs, which are themselves arrays... When I try to convert the list to an array, I get the following error:

coordinate_array = np.array(geometry_list)
ValueError: could not broadcast input array from shape (115,2) into shape (1)
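
Given that structure, one option might be to skip the intermediate arrays and collect the rows directly while looping over the features, which also records which feature each pair came from. This is a rough sketch only: the `out` dictionary below is a made-up stand-in shaped like the one in the comment, the column names `feature` and `ring` are my own, and it assumes every vertex is exactly an [x, y] pair.

```python
import pandas as pd

# Made-up response shaped like the `out` dictionary described in the comment
out = {
    'features': [
        {'geometry': {'rings': [[[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]]}},
        {'geometry': {'rings': [[[5.0, 5.0], [6.0, 5.0]],
                                [[7.0, 7.0], [8.0, 8.0]]]}},
    ]
}

rows = []
for feature_id, rslt in enumerate(out['features']):
    for ring_id, ring in enumerate(rslt['geometry']['rings']):
        for x, y in ring:
            # one output row per coordinate pair
            rows.append((feature_id, ring_id, x, y))

df = pd.DataFrame(rows, columns=['feature', 'ring', 'X', 'Y'])
print(df)
```

Since nothing is converted to a single array, features with different numbers of pairs (or several rings) should not cause a broadcasting error here.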
-1
import pandas as pd
import numpy as np

# sample data: 250 coordinate pairs, stored one pair per row in a 'coord' column
data = np.arange(500).reshape([250, 2])
cols = ['coord']

new_data = []
for item in data:
    new_data.append([item])

df = pd.DataFrame(data=new_data, columns=cols)

print(df.head())

# split each pair into separate 'x' and 'y' columns
def expand(row):
    row['x'] = row.coord[0]
    row['y'] = row.coord[1]

    return row

df = df.apply(expand, axis=1)
df.drop(columns='coord', inplace=True)
print(df.head())

RESULT

    coord
0  [0, 1]
1  [2, 3]
2  [4, 5]
3  [6, 7]
4  [8, 9]


   x  y
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9
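
For a column like coord above, where every cell holds exactly one pair, a row-wise apply can probably be avoided by building the new frame from the column's values in one step. A minimal sketch, assuming the cells are plain two-element lists (the name `xy` is my own):

```python
import numpy as np
import pandas as pd

# Same kind of sample data as above, stored as Python lists in a 'coord' column
data = np.arange(500).reshape(250, 2)
df = pd.DataFrame({'coord': data.tolist()})

# Build the x/y columns directly from the list of pairs, keeping the original index
xy = pd.DataFrame(df['coord'].tolist(), columns=['x', 'y'], index=df.index)
print(xy.head())
```

This only covers the one-pair-per-cell case; cells holding a whole list of pairs would need to be flattened first.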
