I am working with a DataFrame that contains arrays. After read_csv, pandas seems to store my vectors as strings, like this:

df['column'].iloc[3]
>>>'[50.6402809, 4.6667145]'

type(df['column'].iloc[3])
>>> str

How can I convert the entire column to array? Like so:

df['column'].iloc[3]
>>>[50.6402809, 4.6667145]

type(df['column'].iloc[3])
>>> array
2 Answers

If you want NumPy arrays, apply a lambda function that parses each string with ast.literal_eval and converts the result to an array:

import ast
import numpy as np

df['column'] = df['column'].apply(lambda x: np.array(ast.literal_eval(x)))

If you need plain lists instead, either of these equivalent forms works:

df['column'] = df['column'].apply(ast.literal_eval)

# or, as a list comprehension
df['column'] = [ast.literal_eval(x) for x in df['column']]
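For reference, here is a minimal self-contained sketch of the NumPy variant. The sample DataFrame is made up to mimic what read_csv returns for such a column; it is not the asker's actual data.

```python
import ast

import numpy as np
import pandas as pd

# Hypothetical sample data: each cell is a string that looks like a list.
df = pd.DataFrame({"column": ["[50.6402809, 4.6667145]", "[1.5, 2.5]"]})

# Parse each string into a Python list, then wrap it in a NumPy array.
df["column"] = df["column"].apply(lambda s: np.array(ast.literal_eval(s)))

first = df["column"].iloc[0]
print(type(first))  # the cell now holds an ndarray instead of a str
print(first)
```

The column keeps dtype object, since each cell holds a whole array rather than a scalar.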

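You could use the ast module to parse the strings as Python literals. ast.literal_eval does not execute arbitrary code the way eval does, but it is still worth being careful with data read from a file or, worse, from online sources: very large or deeply nested input can exhaust memory or crash the interpreter.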

An alternative is to parse the column directly with the Series.str methods:

In [19]: parsed = (
    ...:     df['column']
    ...:     .str.strip('[]')
    ...:     .str.split(', ')
    ...:     .apply(lambda x: np.array(x).astype(float)))
    ...:

In [20]: parsed
Out[20]:
0    [0.45482146988492345, 0.40132331304489344]
1      [0.4820128044982769, 0.6930103661982894]
2      [0.15845986027370507, 0.825879918750825]
3      [0.08389109330674027, 0.031864037778777]
Name: column, dtype: object
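The same idea as a self-contained sketch, again on made-up sample data. The final np.vstack step is an addition that assumes every row has the same number of values; skip it if the vectors vary in length.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each cell is a string that looks like a list.
df = pd.DataFrame({"column": ["[50.6402809, 4.6667145]", "[1.5, 2.5]"]})

parsed = (
    df["column"]
    .str.strip("[]")   # drop the surrounding brackets
    .str.split(",")    # split into one string per number
    # float() tolerates the leading space left after each comma
    .apply(lambda parts: np.array([float(p) for p in parts]))
)

# Assumption: all rows have equal length, so they stack into a 2-D array.
matrix = np.vstack(parsed.tolist())
print(matrix.shape)
```

Because only strip, split and float are involved, a malformed cell raises a ValueError rather than being interpreted as anything else.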
