Creating 3D numpy.ndarray with no fixed second dimension

Question

Sometimes data, such as speech data, have a known number of observations (n), an unknown duration, and a known number of measurements (k).

In the 2D case in NumPy, it is clear how data with a known number of observations (n) and an unknown duration is represented with an ndarray of shape (n, ). For example:

import numpy as np

x = np.array([ [ 1, 2 ],
               [ 1, 2, 3 ]
             ])

print(x.shape) ### Returns: (2, )

Is there an equivalent for the 3D case in NumPy, where we could have an ndarray of shape (n, , k)? The best alternative to this I can think of is to have a 2D ndarray of shape (n, ) and have each element also be 2D with a (transpose) shape of (k, ). For example,

import numpy as np

x = np.array([ [ [1,2], [1,2] ],
               [ [1,2], [1,2], [1,2] ]
             ])

print(x.shape) ### Returns: (2, ); Desired: (2, , 2)

Ideally, a solution would be able to tell us the dimensionality properties of an ndarray without the need for a recursive call (maybe with an alternative to shape?).

Your first code snippet is not doing what I think you believe it is doing. When I print its result I get array([array([1, 2, 3]), array([1, 2])], dtype=object). This means you are getting a one-dimensional array of objects, which in this case are np.ndarray objects. As far as I am aware, it is not possible to allocate an array without a fixed dimension in any direction.
– pythonweb, Commented Apr 7, 2019 at 1:23
Define x as a (2,2) object dtype array, and set the elements from x1 and x2. But it is tricky to do this without getting broadcasting errors.
– hpaulj, Commented Apr 7, 2019 at 1:29
It might be easier to create a (4,) array with list or 1d array elements, and if needed reshape that to (2,2).
– hpaulj, Commented Apr 7, 2019 at 1:54
Thank you for the correction, I revised the code with your suggestion.
– Joseph Konan, Commented Apr 7, 2019 at 2:08
@JosephKonan: Your revised code is still a one-dimensional array of object dtype. The inner arrays are just Python lists now instead of NumPy arrays.
– user2357112, Commented Apr 7, 2019 at 6:38

user2357112 · Accepted Answer · 2019-04-07 07:17:58Z

2

You seem to have misunderstood what a shape of (2,) means. It doesn't mean (2, <unknown>); the comma is not a separator between 2 and some sort of blank dimension. (2,) is the Python syntax for a one-element tuple whose one element is 2. Python uses this syntax because (2) would mean the integer 2, not a tuple.
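
A quick check in plain Python and NumPy illustrates the point. This is only a minimal sketch; the variable names are made up for illustration.

```python
import numpy as np

shape = (2,)          # a one-element tuple
not_a_tuple = (2)     # parentheses only: this is the integer 2

print(type(shape), len(shape))
print(type(not_a_tuple))

# A 1-D array reports a one-element shape tuple; ndim counts the dimensions.
a = np.zeros(2)
print(a.shape, a.ndim)
```

The `ndim` attribute is the direct way to ask how many dimensions an array has, without reading anything into the trailing comma.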

You are not creating a two-dimensional array with an arbitrary-length second dimension. You are creating a one-dimensional array of object dtype. Its elements are ordinary Python lists. An array like this is incompatible with almost every useful thing in NumPy.
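
To make the "incompatible" claim concrete, here is a small sketch. It builds the object array explicitly with `np.empty` and assignment, which is an assumption on my part to keep the example independent of how a given NumPy version treats ragged nested lists.

```python
import numpy as np

# Build the 1-D object array explicitly instead of relying on np.array
# to infer it from ragged nested lists.
x = np.empty(2, dtype=object)
x[0] = [1, 2]
x[1] = [1, 2, 3]

print(x.shape, x.ndim, x.dtype)

# Arithmetic falls back to the Python operators of the elements,
# so "* 2" repeats each list instead of doubling the numbers.
doubled = x * 2
print(doubled[0])

# Column-style indexing is not available: there is no second axis.
try:
    x[:, 0]
except IndexError as exc:
    print("IndexError:", exc)
```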

There is no way to create NumPy arrays with variable-length dimensions, whether in the 2D case you thought worked, or in the 3D case you're trying to make work.
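
Neither answer proposes this, but a common workaround for data like the (n, duration, k) case in the question is to pad every observation to the longest duration and keep the true lengths separately. The sketch below assumes zero padding is acceptable for the data; `pad_sequences` is a hypothetical helper written for this example, not a NumPy function.

```python
import numpy as np

def pad_sequences(seqs, fill=0.0):
    """Stack (t_i, k) arrays into one (n, t_max, k) array plus a lengths vector."""
    lengths = np.array([len(s) for s in seqs])
    k = seqs[0].shape[1]                      # assumes every item has k columns
    out = np.full((len(seqs), lengths.max(), k), fill, dtype=float)
    for i, s in enumerate(seqs):
        out[i, :len(s)] = s                   # copy real data, leave padding
    return out, lengths

# n = 2 observations, k = 2 measurements, durations 2 and 3
seqs = [np.ones((2, 2)), np.ones((3, 2))]
padded, lengths = pad_sequences(seqs)

# Boolean mask of shape (n, t_max): True where the data is real.
mask = np.arange(padded.shape[1]) < lengths[:, None]

print(padded.shape)
print(lengths)
print(mask)
```

With this layout `padded.shape` is an ordinary three-element tuple, and the per-observation durations live in `lengths` rather than in the shape.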



hpaulj · Accepted Answer · 2019-04-07 06:43:14Z

Just to review the 1d case:

In [33]: x = np.array([[1,2],[1,2,3]])                                          
In [34]: x.shape                                                                
Out[34]: (2,)
In [35]: x                                                                      
Out[35]: array([list([1, 2]), list([1, 2, 3])], dtype=object)

The result is a 2-element array of lists, whereas we started with a list of lists. Not much difference.

But note that if the lists are the same size, np.array creates a numeric 2d array:

In [36]: x = np.array([[1,2,4],[1,2,3]])                                        
In [37]: x                                                                      
Out[37]: 
array([[1, 2, 4],
       [1, 2, 3]])

So don't count on the behavior we see in [33].
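
One way to avoid depending on that inference is to allocate the object array first and assign the elements one by one. This matters more now than when the session above was recorded: recent NumPy releases raise a ValueError for ragged nested lists like the one in [33] unless dtype=object is passed. The helper name `to_object_array` below is hypothetical, a sketch rather than an established idiom.

```python
import numpy as np

def to_object_array(seqs):
    """Return a 1-D object array with one element per input sequence."""
    out = np.empty(len(seqs), dtype=object)
    for i, s in enumerate(seqs):
        out[i] = s          # stores the sequence itself, no shape inference
    return out

ragged = to_object_array([[1, 2], [1, 2, 3]])
equal = to_object_array([[1, 2, 4], [1, 2, 3]])

print(ragged.shape, ragged.dtype)
print(equal.shape, equal.dtype)   # still 1-D, even though the lengths match
```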

I could create a 2d object array:

In [59]: x = np.empty((2,2),object)                                             
In [60]: x                                                                      
Out[60]: 
array([[None, None],                  # in this case filled with None
       [None, None]], dtype=object)

I can assign each element with a different kind and size of object:

In [61]: x[0,0] = np.arange(3)                                                  
In [62]: x[0,0] = [1,2,3]                                                       
In [63]: x[1,0] = 'abc'                                                         
In [64]: x[1,1] = np.arange(6).reshape(2,3)                                     
In [65]: x                                                                      
Out[65]: 
array([[list([1, 2, 3]), None],
       ['abc', array([[0, 1, 2],
       [3, 4, 5]])]], dtype=object)

It is still 2d. For most purposes it is like a list or list of lists, containing objects. The data buffer actually holds pointers to objects stored elsewhere in memory (just as a list's buffer does).
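
The pointer behaviour can be observed directly. In this small sketch (the name `payload` is just for illustration), a list stored in an object array is the same object as the original, so changes made outside the array show up inside it.

```python
import numpy as np

payload = [1, 2, 3]
x = np.empty((2, 2), dtype=object)
x[0, 0] = payload

print(x[0, 0] is payload)   # the array holds a reference, not a copy

payload.append(4)           # mutate the list outside the array
print(x[0, 0])

# The buffer itself only stores one pointer-sized slot per element.
print(x.itemsize, x.nbytes)
```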

There really isn't such a thing as a 3d array with a variable last dimension. At best we can get a 2d array that contains lists or arrays of various sizes.

Make a list of 2 2d arrays:

In [69]: alist = [np.arange(6).reshape(2,3), np.arange(4.).reshape(2,2)]        
In [70]: alist                                                                  
Out[70]: 
[array([[0, 1, 2],
        [3, 4, 5]]), array([[0., 1.],
        [2., 3.]])]

In this case, giving it to np.array raises an error:

In [71]: np.array(alist)
---------------------------------------------------------------------------
ValueError: could not broadcast input array from shape (2,3) into shape (2)

We could fill an object array with elements from this list:

In [72]: x = np.empty((4,),object)                                              
In [73]: x[0]=alist[0][0]                                                       
In [74]: x[1]=alist[0][1]                                                       
In [75]: x[2]=alist[1][0]                                                       
In [76]: x[3]=alist[1][1]                                                       
In [77]: x                                                                      
Out[77]: 
array([array([0, 1, 2]), array([3, 4, 5]), array([0., 1.]),
       array([2., 3.])], dtype=object)

and reshape it to 2d:

In [78]: x.reshape(2,2)                                                         
Out[78]: 
array([[array([0, 1, 2]), array([3, 4, 5])],
       [array([0., 1.]), array([2., 3.])]], dtype=object)

The result is a 2d array containing 1d arrays. To get the shapes of the elements I have to do something like:

In [87]: np.frompyfunc(lambda i:i.shape, 1,1)(Out[78])                          
Out[87]: 
array([[(3,), (3,)],
       [(2,), (2,)]], dtype=object)
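
The steps above can be collected into one runnable sketch. The loop-based fill and the `lengths` array are illustrative additions, and the fill assumes every array in `alist` has the same number of rows.

```python
import numpy as np

alist = [np.arange(6).reshape(2, 3), np.arange(4.).reshape(2, 2)]

# Flatten the list of 2d arrays into a list of 1d rows.
rows = [row for arr in alist for row in arr]

# Fill a 1-D object array element by element, then reshape to 2d.
x = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    x[i] = row
x2d = x.reshape(len(alist), -1)

# Apply a Python function to every element to collect shapes and lengths.
shapes = np.frompyfunc(lambda a: a.shape, 1, 1)(x2d)
lengths = np.frompyfunc(len, 1, 1)(x2d).astype(int)

print(shapes)
print(lengths)
```

Note that `np.frompyfunc` still calls the Python function once per element, so this is a convenience for keeping the result in array form rather than a speedup.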
