How to efficiently extract values from nested numpy arrays generated by loadmat function?

Question

Is there a more efficient way in Python to extract data from a nested NumPy object array such as A = array([[array([[12000000]])]], dtype=object)? I have been using A[0][0][0][0], but that does not seem efficient when you have many arrays like A.

I have also tried numpy.squeeze(array([[array([[12000000]])]], dtype=object)), but this gives me

array(array([[12000000]]), dtype=object)

PS: The nested array was generated by the loadmat() function in the scipy module when loading a .mat file that contains nested structures.

After np.squeeze(np.array([[np.array([[12000000]])]], dtype=object)) I got array(12000000, dtype=object)
– aiven, Commented Jan 12, 2018 at 20:11
Nope, the result is just a numpy scalar; you can wrap it in int() if you want
– aiven, Commented Jan 12, 2018 at 20:18
Almost, use np.squeeze(np.array([[np.array([[12000000]])]], dtype=object)).item()
– cs95, Commented Jan 12, 2018 at 20:19
Just wondering, is this output produced by the loadmat function? I get a similar result when I load a .mat file using loadmat.
– Spandyie, Commented Jan 12, 2018 at 20:20

hpaulj · Accepted Answer · 2018-01-13 03:54:39Z

3

Creating such an array is a bit tedious, but loadmat does it to handle MATLAB cells and 2d matrices:

In [5]: A = np.empty((1,1),object)
In [6]: A[0,0] = np.array([[1.23]])
In [7]: A
Out[7]: array([[array([[ 1.23]])]], dtype=object)
In [8]: A.any()
Out[8]: array([[ 1.23]])
In [9]: A.shape
Out[9]: (1, 1)

squeeze compresses the shape, but does not cross the object boundary

In [10]: np.squeeze(A)
Out[10]: array(array([[ 1.23]]), dtype=object)

But if an array holds exactly one item (regardless of shape), item() can extract it. Indexing with A[0,0] also works.

In [11]: np.squeeze(A).item()
Out[11]: array([[ 1.23]])

Call item() again to extract the number from that inner array:

In [12]: np.squeeze(A).item().item()
Out[12]: 1.23

Or we don't even need the squeeze:

In [13]: A.item().item()
Out[13]: 1.23

loadmat also has a squeeze_me parameter, which squeezes unit dimensions out of the arrays it loads.
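As a rough illustration of squeeze_me, the sketch below writes a tiny .mat file to an in-memory buffer and loads it twice. The use of savemat and io.BytesIO is my own assumption to keep the example self-contained; with a real file you would pass its path instead. Nested structs may behave differently from this plain numeric matrix.

```python
import io

import numpy as np
from scipy.io import loadmat, savemat

# Write a one-variable .mat file into memory (illustrative data only).
buf = io.BytesIO()
savemat(buf, {"x": np.array([[12000000]])})

# Default load: MATLAB scalars come back as 2-D arrays.
buf.seek(0)
raw = loadmat(buf)["x"]

# With squeeze_me=True the unit dimensions are dropped.
buf.seek(0)
squeezed = loadmat(buf, squeeze_me=True)["x"]

print(raw.shape)
print(squeezed)
```

Whether squeeze_me is enough depends on the file; for deeply nested structures you may still need item() or indexing on what comes back.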

Indexing is just as easy:

In [17]: A[0,0]
Out[17]: array([[ 1.23]])
In [18]: A[0,0][0,0]
Out[18]: 1.23

astype can also work (though it can be picky about the number of dimensions).

In [21]: A.astype(float)
Out[21]: array([[ 1.23]])

With single-item arrays like this, efficiency isn't much of an issue. All these methods are quick. Things become more complicated when the array has many items, or the items are themselves large.
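For the many-items case, one possible approach (a sketch, not the only way) is to unwrap every cell in a single pass and rebuild a regular numeric array. The 2x3 object array below is made-up data, built the same way as A above so that each cell holds a 1x1 array.

```python
import numpy as np

# Build a 2x3 object array whose cells each hold a 1x1 numeric array,
# similar to what loadmat returns for a MATLAB cell array of scalars.
A = np.empty((2, 3), dtype=object)
for idx in np.ndindex(A.shape):
    A[idx] = np.array([[float(idx[0] * 3 + idx[1])]])

# Unwrap every cell with item() in one pass, then restore the shape.
values = np.array([cell.item() for cell in A.ravel()]).reshape(A.shape)

print(values.dtype, values.shape)
```

After this, values is an ordinary float array, so normal vectorized indexing and arithmetic apply. The list comprehension assumes every cell holds exactly one number; item() raises an error otherwise.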

How to access elements of numpy ndarray?

edited Jan 13, 2018 at 3:54

answered Jan 12, 2018 at 20:37

hpaulj


1 Comment

Karma Over a year ago

The .mat file that I am loading is a MATLAB nested structure. I tried using squeeze_me=True, but that makes the object of type void.

Vermz99 · Accepted Answer · 2018-01-12 20:22:00Z

3

You could use A.all() or A.any() to pull out the single element. This only works if A contains exactly one element.

answered Jan 12, 2018 at 20:22

Vermz99


Sourya Dey · Accepted Answer · 2019-09-05 23:59:43Z

2

Try A.flatten()[0]

This will flatten the array into a single dimension and extract the first item from it. In your case, the first item is the only item.
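A small sketch of how this plays out on a loadmat-style array, assuming A is built as an object array holding one 1x1 array (my reconstruction of the question's data). One flatten()[0] only removes the outer layer, so a second step is needed to reach the number.

```python
import numpy as np

# Outer object array wrapping a 1x1 numeric array, as in the question.
A = np.empty((1, 1), dtype=object)
A[0, 0] = np.array([[12000000]])

inner = A.flatten()[0]      # still the inner 1x1 array
value = inner.flatten()[0]  # the number itself, as a numpy integer scalar

# The .flat iterator gives the same result without copying the data.
also = A.flat[0].flat[0]

print(value, also)
```

flatten() always returns a copy, which is negligible here but can matter for large arrays; .flat or ravel() avoid the copy where possible.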

answered Sep 5, 2019 at 23:59

Sourya Dey


zwep · Accepted Answer · 2019-02-15 12:45:08Z

What worked in my case was the following:

import os

import scipy.io

# dir_data and file_name are placeholders for the location of your own .mat file.
xcat = scipy.io.loadmat(os.path.join(dir_data, file_name))
pars = xcat['pars']  # Extract the numpy.void element from the loadmat object

# Note that you are dealing with a numpy structured array object when you enter pars[0][0].
# Thus you can access the field names and all that...
dict_values = [x[0][0] for x in pars[0][0]]  # Extract all elements in one go
dict_keys = list(pars.dtype.names)  # Extract the corresponding names/tags
dict_xcat = dict(zip(dict_keys, dict_values))  # Pack it up again in a dict

The idea is to first extract all the values I want and put them in a plain Python dict. This saves me from cumbersome indexing later in the file.

Of course, this is a very specific solution, since in my case the values I needed were all floats/ints.
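A more general variant of the same idea can be sketched as a recursive helper. The function name unwrap and the sample struct with fields gain and count are hypothetical, invented for illustration; the struct is built by hand to mimic a 1x1 loadmat struct so the example runs without a .mat file.

```python
import numpy as np


def unwrap(obj):
    """Recursively turn loadmat-style nesting into plain Python values."""
    if isinstance(obj, np.ndarray):
        if obj.dtype.names is not None and obj.size == 1:
            # 1x1 struct: build one dict entry per field name.
            record = obj.reshape(-1)[0]
            return {name: unwrap(record[name]) for name in obj.dtype.names}
        if obj.dtype == object:
            # Cell-like array: unwrap each cell, collapsing a single cell.
            cells = [unwrap(cell) for cell in obj.ravel()]
            return cells[0] if len(cells) == 1 else cells
        if obj.size == 1:
            # Single-element numeric array: return the bare Python scalar.
            return obj.item()
    return obj


# Hand-built stand-in for a 1x1 MATLAB struct with two scalar fields.
pars = np.empty((1, 1), dtype=[("gain", object), ("count", object)])
pars["gain"][0, 0] = np.array([[1.5]])
pars["count"][0, 0] = np.array([[3]])

result = unwrap(pars)
print(result)
```

Numeric arrays with more than one element are returned unchanged, so larger matrices inside the struct stay as NumPy arrays. Struct arrays with more than one record are also left as they are in this sketch.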
