I have data containing some large NumPy arrays that I save to a .csv like this:

import numpy as np
import pandas as pd

df = pd.DataFrame()

for x in range(1, 6):
    data = {'a':x,
            'b':np.array(range(1, 10000))}

    df = df.append(data, ignore_index=True)

df.to_csv(f"./src/temp/test.csv", index=True)

The arrays get saved like:

0 1 [   1    2    3 ... 9997 9998 9999]
1 2 [   1    2    3 ... 9997 9998 9999]
2 3 [   1    2    3 ... 9997 9998 9999]
3 4 [   1    2    3 ... 9997 9998 9999]
4 5 [   1    2    3 ... 9997 9998 9999]

The summarized "..." drops most of the values, which makes it impossible to read the arrays back later. How can I solve this?

EDIT: I was planning to read it later like:

convert = lambda x: np.fromstring(x.strip('[]'), dtype=int, sep=' ')
df = pd.read_csv('./src/temp/test.csv',
                  converters={'b': convert},
                  index_col=0)
  • Who/what is the intended reader? Pickle it? Commented May 4, 2021 at 6:12
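
For comparison, the pickle route suggested in the comment keeps the ndarray objects intact, so no string parsing is needed on read. A minimal sketch (the .pkl filename is illustrative):

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': list(range(1, 6)),
                   'b': [np.arange(1, 10000) for _ in range(5)]})

# Pickle preserves the array objects exactly; read_pickle returns them unchanged.
df.to_pickle('./src/temp/test.pkl')
df2 = pd.read_pickle('./src/temp/test.pkl')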

1 Answer

You can change the NumPy printing threshold with np.set_printoptions():

threshold: Total number of array elements which trigger summarization rather than full repr (default 1000). To always use the full repr without summarization, pass sys.maxsize.

np.set_printoptions(threshold=100000) # or threshold=sys.maxsize
df.to_csv('./src/temp/test.csv', index=True)
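
Putting this together with the converter from the question, the full round trip could look like the sketch below. The large linewidth is an extra assumption on my part, to keep each array's repr on a single line so the CSV field contains no embedded newlines:

import sys

import numpy as np
import pandas as pd

# Print the full array repr: no "..." summarization, and keep each
# array on one line (the large linewidth is an assumption, not part of the answer).
np.set_printoptions(threshold=sys.maxsize, linewidth=10**9)

df = pd.DataFrame({'a': list(range(1, 6)),
                   'b': [np.arange(1, 10000) for _ in range(5)]})
df.to_csv('./src/temp/test.csv', index=True)

# Read it back with the converter from the question.
convert = lambda s: np.fromstring(s.strip('[]'), dtype=int, sep=' ')
df2 = pd.read_csv('./src/temp/test.csv',
                  converters={'b': convert},
                  index_col=0)

print(df2.loc[0, 'b'].shape)  # expected: (9999,)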