
Can someone help me to wrap my head around numpy?

In the following code I expect col1 to be an array of shape (2, 3), just like expected_arr. But col1 apparently has shape (2,). I suppose that means it's an array of two tuples (instead of an array of two arrays with 3 values each).

import numpy as np
import random
from collections import deque 

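# dtype=object must be given explicitly on NumPy 1.24+, which otherwise raises
# ValueError for this nested input; older versions inferred it automatically.
vals = np.array([
  [[1, 2, 3], False],
  [[4, 5, 6], False]
], dtype=object)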

col1 = vals[:,0]

print(col1)
print(col1.shape)

expected_arr = np.array([[1, 2, 3], [4, 5, 6]])

print(expected_arr)
print(expected_arr.shape)

So, what I want is, given a structure of vals, I'd like to get the first column so that the output is an array of shape (2,3).

Can someone help me out here?

2 Answers

3

In this case, vals was constructed from a structure that cannot be interpreted as an array (i.e. a contiguous block of same-sized elements) of one of the basic numerical types. The list from which the array is created contains elements of mixed types.

When this happens, numpy's array constructor tries to create a generic array with dtype "object", i.e. an array of plain Python objects (actually, references to them). It is quite analogous to a pointer array in C. vals is thus a 2 x 2 array holding the objects

the list `[1, 2, 3]` | the bool `False`
---------------------------------------
the list `[4, 5, 6]` | the bool `False`

as a row-major array in memory.

Since col1 is obtained by slice-indexing the 2-dimensional array vals, you get a one-dimensional array that contains two elements, namely the two Python lists.
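To make this concrete, here is a minimal sketch that inspects the array and the column. It assumes dtype=object is passed explicitly, which is not in the original code but is needed on NumPy 1.24 and later, where this nested input otherwise raises a ValueError.

```python
import numpy as np

# Assumption: dtype=object is given explicitly so this also runs on NumPy 1.24+.
vals = np.array([
    [[1, 2, 3], False],
    [[4, 5, 6], False],
], dtype=object)

print(vals.shape)      # (2, 2)
print(vals.dtype)      # object

col1 = vals[:, 0]
print(col1.shape)      # (2,)
print(type(col1[0]))   # <class 'list'>
```

Each element of col1 is an ordinary Python list, so numpy sees two opaque objects rather than a 2 x 3 block of integers.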


1 Comment

Makes sense. Thanks!
2

When you create this array:

vals = np.array([
  [[1, 2, 3], False],
  [[4, 5, 6], False]
])

this is not a rectangular array, so numpy makes it an object array

vals.dtype
dtype('O')

with shape (2,2). The objects in the array are two lists and two boolean values.

When you index just the first column, the resulting array still has dtype('O'):

vals[:,0].dtype
dtype('O')

which means it's still an array of list objects. To convert this to a regular 2-D array, you'll need to use np.vstack:

np.vstack(vals[:,0])
array([[1, 2, 3],
       [4, 5, 6]])

Object arrays aren't really efficient in numpy. They don't take advantage of any of numpy's optimizations and cause lots of problems when converting back and forth. You can try structured arrays, or separate arrays.
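A rough sketch of both suggestions follows. The field names "values" and "flag", the int64 element type, and the variable names are assumptions chosen for illustration; they do not come from the question.

```python
import numpy as np

# Option 1: keep the numbers and the flags in two separate, regular arrays.
values = np.array([[1, 2, 3], [4, 5, 6]])
flags = np.array([False, False])

# Option 2: a structured array with a 3-element integer field and a boolean field.
# The field names "values" and "flag" are hypothetical.
record_dtype = np.dtype([("values", np.int64, (3,)), ("flag", np.bool_)])
records = np.array(
    [([1, 2, 3], False), ([4, 5, 6], False)],
    dtype=record_dtype,
)

print(values.shape)              # (2, 3)
print(records["values"].shape)   # (2, 3)
print(records["flag"])           # [False False]
```

With either layout the first "column" is already a proper (2, 3) numeric array, so no conversion step is needed.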

1 Comment

Great answer! Thank you so much!
