
I am using Python and NumPy to do some data analysis.

I have a large 3D grid (NxNxN) where each cell is itself a 3x3 matrix, making the whole thing a five-dimensional NumPy array. Calling the array data, it looks like this:

data[N,N,N,3,3]  

I need to find the eigenvalues of all the 3x3 matrices, and for that I use NumPy's eigvals routine, but it takes ages. Right now I pretty much do this:

import numpy as np

for i in range(N):
    for j in range(N):
        for k in range(N):
            a = np.linalg.eigvals(data[i, j, k, :, :])

For N = 256, this takes about an hour. Any ideas on how to make this more efficient?

Many thanks for any suggestions!

2 Comments

  • Have you profiled? I suspect you're spending much more time in eigvals than you are iterating. Commented Aug 7, 2011 at 16:28
  • eigvals takes about three orders of magnitude longer by my timeit calculations, so I don't think changing the iteration is going to affect anything. Commented Aug 7, 2011 at 16:50

3 Answers


itertools.product is nicer than nested loops, aesthetically speaking. But I don't think it will make your code that much faster. My testing suggests that iteration is not your bottleneck.

>>> bigdata = numpy.arange(256 * 256 * 256 * 3 * 3).reshape(256, 256, 256, 3, 3)
>>> %timeit numpy.linalg.eigvals(bigdata[100, 100, 100, :, :])
10000 loops, best of 3: 52.6 us per loop

So, underestimating the total:

>>> .000052 * 256 * 256 * 256 / 60
14.540253866666665

That's 14 minutes minimum on my computer, which is pretty new. Let's see how long the loops take...

>>> def just_loops(N):
...     for i in xrange(N):
...         for j in xrange(N):
...             for k in xrange(N):
...                 pass
... 
>>> %timeit just_loops(256)
1 loops, best of 3: 350 ms per loop

Orders of magnitude smaller, as DSM said. Even the work of slicing out each 3x3 block, with no eigvals call at all, is more substantial:

>>> def slice_loops(N, data):
...     for i in xrange(N):
...         for j in xrange(N):
...             for k in xrange(N):
...                 data[i, j, k, :, :]
... 
>>> %timeit slice_loops(256, bigdata)
1 loops, best of 3: 33.5 s per loop
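For what it's worth, a sufficiently new NumPy (1.8 or later, which postdates this answer) lets eigvals broadcast over leading axes, so the whole stack of 3x3 matrices can be handled in one call with no per-matrix Python overhead. A minimal sketch on a small stand-in array:

>>> import numpy
>>> small = numpy.random.rand(8, 8, 8, 3, 3)   # stand-in for bigdata
>>> evs = numpy.linalg.eigvals(small)          # one call covers every 3x3 block
>>> evs.shape
(8, 8, 8, 3)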

1 Comment

Thanks for the very thorough answer! From your tests it does indeed look like there is not much to be done to make it faster.

I'm sure there is a good way to do this in NumPy, but in general, itertools.product is faster than nested loops over ranges.

from itertools import product

for i, j, k in product(xrange(N), xrange(N), xrange(N)):
    a = np.linalg.eigvals(data[i,j,k,:,:])
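Since the three ranges are identical, product's repeat argument gives a more compact spelling of the same triple loop:

from itertools import product

for i, j, k in product(xrange(N), repeat=3):
    a = np.linalg.eigvals(data[i,j,k,:,:])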

2 Comments

In this case -- since N is so small -- I actually find that the loop overhead is roughly twice as large with the product loop than it is with the nested ranges. I still like the product approach better, because it's flatter and the overhead is negligible here anyway.
That's interesting. It would seem to me that creating 65536 + 256 inner lists would be slower (although I didn't expect it to make a huge difference).
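A quick way to check that kind of overhead claim is to time both loop shapes with empty bodies, as in the answer above; this is just a sketch, with illustrative function names:

from itertools import product

def nested(N):
    for i in xrange(N):
        for j in xrange(N):
            for k in xrange(N):
                pass

def flat(N):
    for i, j, k in product(xrange(N), repeat=3):
        pass

# In IPython, compare: %timeit nested(256)  vs.  %timeit flat(256)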

Since all the calculations are independent, you can use the multiprocessing module to speed things up if you have a multi-core processor.
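A minimal sketch of that idea, assuming the work is split along the first axis of data; the helper name eig_slab and the chunking strategy are illustrative choices, not part of the original suggestion:

import numpy as np
from multiprocessing import Pool

def eig_slab(slab):
    # slab has shape (N, N, 3, 3); collect the eigenvalues of each 3x3 block
    out = np.empty(slab.shape[:2] + (3,), dtype=complex)
    for j in range(slab.shape[0]):
        for k in range(slab.shape[1]):
            out[j, k] = np.linalg.eigvals(slab[j, k])
    return out

if __name__ == '__main__':
    N = 64  # small N so the demo runs quickly
    data = np.random.rand(N, N, N, 3, 3)

    pool = Pool()  # one worker process per core by default
    # one task per slice along the first axis; results come back in order
    results = pool.map(eig_slab, [data[i] for i in range(N)])
    pool.close()
    pool.join()

    eigs = np.array(results)  # shape (N, N, N, 3)

Each worker pays a pickling cost to receive its slab, so the net speedup depends on how expensive the eigvals calls are relative to that transfer.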

