How to read csv into multi-dimensional array

Question

I'm trying to read from a CSV where the first four columns are the indexes for a multi-dimensional array. I get the error:

KeyError: 0

from:

sp = []
csvFile = open("sp.csv", "rb")
csvReader = csv.reader(csvFile)
for row in csvReader:
    print row
    sp[int(row[0])][int(row[1])][int(row[2])][int(row[3])] = float(row[4])

Community · Accepted Answer · 2017-05-23 12:31:44Z

2

You need to initialize a dictionary at every dimension, e.g. sp[int(row[0])] needs to be assigned to first before you can index it with [int(row[1])].
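A minimal sketch of what initializing every level could look like, assuming nested dicts rather than lists. The in-memory `sample` rows are made up for illustration and stand in for sp.csv:

```python
import csv
import io

# Hypothetical sample data standing in for sp.csv:
# four index columns followed by one value column.
sample = io.StringIO("0,0,0,0,4.1\n1,1,2,0,5.2\n0,1,1,3,3.2\n")

sp = {}
for row in csv.reader(sample):
    i, j, k, l = (int(x) for x in row[:4])
    # setdefault creates the inner dict the first time a key is seen,
    # so every level exists before the next one is indexed.
    sp.setdefault(i, {}).setdefault(j, {}).setdefault(k, {})[l] = float(row[4])

print(sp[1][1][2][0])  # -> 5.2
```

Only the index combinations that actually occur in the file get an entry, so nothing has to be preallocated.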

Edit: depending on your use case, you may get away with a single dictionary keyed by tuples:

sp = {}
sp[(int(row[0]), int(row[1]), ...)] = float(row[4])
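Spelled out for the four index columns from the question, the tuple-key idea might look like the sketch below. The `sample` rows are invented stand-ins for sp.csv:

```python
import csv
import io

# Hypothetical sample data standing in for sp.csv.
sample = io.StringIO("0,0,0,0,4.1\n1,1,2,0,5.2\n")

sp = {}
for row in csv.reader(sample):
    key = tuple(int(x) for x in row[:4])  # (i, j, k, l)
    sp[key] = float(row[4])

print(sp[(1, 1, 2, 0)])            # -> 5.2
print(sp.get((9, 9, 9, 9), 0.0))   # -> 0.0, missing entries fall back to a default
```

The trade-off is that you can no longer slice along one dimension (e.g. "everything under sp[1]") without scanning the keys.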

Yet another edit: I was thinking you might use NumPy and ended up at this question, "Python multi-dimensional array initialization without a loop", which actually reflects your problem. Its accepted answer is a non-NumPy solution. You'd need to know the dimensions in advance for this, though.
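If the dimensions are known up front, one way to preallocate without NumPy is a recursive nested-list builder. This is only a sketch and not necessarily the approach taken in the linked answer; `make_nested` and the sizes in `dims` are hypothetical:

```python
def make_nested(dims, fill=0.0):
    """Build a nested list with the given sizes, filled with `fill`."""
    if len(dims) == 1:
        return [fill] * dims[0]
    # Build each sub-list separately so rows are not aliases of each other.
    return [make_nested(dims[1:], fill) for _ in range(dims[0])]

dims = (2, 2, 3, 4)  # assumed sizes, one per index column
sp = make_nested(dims)

sp[1][1][2][0] = 5.2
print(sp[1][1][2][0])  # -> 5.2
print(sp[0][1][2][0])  # -> 0.0, other cells are untouched
```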

edited May 23, 2017 at 12:31

CommunityBot

1 reputation, 1 silver badge

answered May 17, 2014 at 12:36

Nicolas78

5,144 reputation, 1 gold badge, 26 silver badges, 43 bronze badges

5 Comments

Tjorriemorrie Over a year ago

sigh, so n^4? thought python was better than that :(

Nicolas78 Over a year ago

collections.defaultdict helps but only one level deep

martineau Over a year ago

@Tjorriemorrie: Although it would require a lot more memory, you could make the multi-dimensional array a dictionary of dictionaries and thereby avoid having to preallocate every entry in it.

Nicolas78 Over a year ago

@martineau my bad, I was already thinking of dictionaries in my comment. However, even then sp[int(row[0])] defaults to {} and you need to initialize it at position [int(row[1])] to a new {} before you can assign to it

martineau Over a year ago

@Nicolas78: It's possible to use defaultdict and avoid having to do all that initialization -- it's called autovivification. See my answer.

martineau · Accepted Answer · 2014-06-03 11:35:48Z

2

Instead of an array, you could use a dictionary of dictionaries like this to avoid having to preallocate the entire structure beforehand:

from collections import defaultdict
tree = lambda: defaultdict(tree)

sp = tree()

print(3 in sp[1][2])  # -> False
sp[1][2][3] = 4.1
print(3 in sp[1][2])  # -> True
print(sp[1][2][3])  # -> 4.1

sp[9][7][9] = 5.62
sp[4][2][0] = 6.29
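Applied to the CSV from the question, the same autovivifying structure could be filled like this. It is a sketch: the `sample` rows are invented stand-ins for sp.csv, and `tree` is written as a named function instead of a lambda:

```python
import csv
import io
from collections import defaultdict

def tree():
    # Each missing key yields another tree, so any depth can be indexed.
    return defaultdict(tree)

# Hypothetical sample data standing in for sp.csv.
sample = io.StringIO("0,0,0,0,4.1\n1,1,2,0,5.2\n")

sp = tree()
for row in csv.reader(sample):
    i, j, k, l = (int(x) for x in row[:4])
    sp[i][j][k][l] = float(row[4])  # intermediate levels are created on demand

print(sp[1][1][2][0])  # -> 5.2
```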

edited Jun 3, 2014 at 11:35

answered May 17, 2014 at 13:17

martineau

124k reputation, 29 gold badges, 181 silver badges, 319 bronze badges

3 Comments

Nicolas78 Over a year ago

This is a thing of beauty.

Tjorriemorrie Over a year ago

I can't seem to get this to work, can you perhaps please elaborate a bit more? my sp[1][2][3] returns defaultdict(<function <lambda> at 0x10cf05230>, {})

martineau Over a year ago

That's because you didn't assign a terminal (aka "leaf") value to sp[1][2][3] before referencing its contents, so an empty defaultdict (aka a "branch" node) got created automatically by default. This is instead of a KeyError: 3 being raised because the defaultdict in sp[1][2] -- also automatically created -- doesn't have a value for that key.
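The behaviour described in that comment can be reproduced with a short sketch. The `lookup` helper is hypothetical, added only to show one way of reading a value without creating empty branches as a side effect:

```python
from collections import defaultdict

def tree():
    return defaultdict(tree)

sp = tree()
sp[1][2][3] = 4.1

print(7 in sp[1][2])   # -> False, a membership test creates nothing
missing = sp[1][2][7]  # merely reading a missing key creates an empty branch
print(7 in sp[1][2])   # -> True
print(isinstance(missing, defaultdict))  # -> True

def lookup(node, keys, default=None):
    """Walk the tree along `keys` without autovivifying missing branches."""
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node

print(lookup(sp, (1, 2, 3)))  # -> 4.1
print(lookup(sp, (5, 5, 5)))  # -> None
print(5 in sp)                # -> False, nothing was created for key 5
```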

ojdo · Accepted Answer · 2014-05-17 13:25:19Z

How about using NumPy? sp.csv might look like this:

0,0,0,4.1
1,1,2,5.2
0,1,1,3.2

Then, using NumPy, reading from the file becomes a one-liner:

import numpy as np
sp = np.loadtxt('sp.csv', delimiter=',')

This yields a 2D float array with one row per record:

array([[ 0. ,  0. ,  0. ,  4.1],
       [ 1. ,  1. ,  2. ,  5.2],
       [ 0. ,  1. ,  1. ,  3.2]])

Converting this sparse coordinate list to a full ndarray works like this, assuming 0-based indexing. I'm not happy with the idx= line (there must be a more direct way), but it works:

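# loadtxt returns floats; a shape must be integers, so convert the index columns.
max_indices = sp[:, :-1].max(axis=0).astype(int)
fl = np.zeros(tuple(max_indices + 1))
for row in sp:
    idx = tuple(row[:-1].astype(int))  # index columns as an integer tuple
    fl[idx] = row[-1]                  # last column holds the value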

Resulting in the following ndarray fl:

array([[[ 4.1,  0. ,  0. ],
        [ 0. ,  3.2,  0. ]],

       [[ 0. ,  0. ,  0. ],
        [ 0. ,  0. ,  5.2]]])
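A more direct alternative to the per-row loop is NumPy's integer array indexing: passing one index array per axis assigns all values in a single statement. The sketch below holds the same sample rows in memory instead of reading sp.csv:

```python
import numpy as np

# Same sample rows as above, held in memory instead of read from sp.csv.
sp = np.array([[0, 0, 0, 4.1],
               [1, 1, 2, 5.2],
               [0, 1, 1, 3.2]])

idx = sp[:, :-1].astype(int)               # (n_rows, n_dims) integer indices
fl = np.zeros(tuple(idx.max(axis=0) + 1))  # largest index + 1 along each axis
fl[tuple(idx.T)] = sp[:, -1]               # one index array per axis

print(fl.shape)     # -> (2, 2, 3)
print(fl[1, 1, 2])  # -> 5.2
```

If the same index tuple appears more than once in the file, the last occurrence wins, as it does in the loop version.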
