
I want to build a numpy.array (of shape (3, 2)) using numpy.fromiter

The numpy array will consist of 3 numpy arrays containing 2 floats each. These 3 arrays will be the output of a custom function but for the example I will use numpy.random.randn.

Inspired by the documentation, my code looks like:

import numpy as np

iterable = (np.random.randn(2) for _ in range(3))
np.fromiter(iterable, float, 3)  # raises ValueError

But I get the following error that I do not understand:

ValueError: setting an array element with a sequence.

I could simply use np.array([np.random.randn(2) for _ in range(3)]) (which works as I want), but as I understand it this would be less efficient, since the intermediate list is actually built.

  • What is the full error message? Commented Dec 19, 2019 at 1:14
  • @AMC that is the full error message, except that it tells me the error occurs at line 2 Commented Dec 19, 2019 at 1:17
  • Adding this here, just in case: Are you looking for a 2-dimensional array, or an array/list containing 3 arrays? Commented Dec 19, 2019 at 1:56

2 Answers


Your iterable produces a sequence of size 2 arrays:

In [273]: for i in iterable: print(i)
[0.72823755 2.04461013]
[-0.17102804  0.14188038]
[-1.1838654   1.01953532]

But fromiter expects a sequence of floats - 1 float at a time.

Its documentation says: "Create a new 1-dimensional array from an iterable object."

But your list version produces a 2d array!
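
To make the difference concrete, here is a minimal sketch (the variable names are made up for illustration): fromiter consumes one scalar per iteration and returns a 1-d array, while np.array stacks a list of equal-length rows into a 2-d array.

```python
import numpy as np

# fromiter takes one scalar per iteration and always returns a 1-d array
scalars = (float(i) for i in range(6))
flat = np.fromiter(scalars, dtype=float, count=6)
print(flat.shape)

# np.array stacks a list of equal-length rows into a 2-d array
rows = [np.random.randn(2) for _ in range(3)]
stacked = np.array(rows)
print(stacked.shape)
```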

===

Define a structured array dtype:

In [283]: dt=np.dtype('f,f')
In [284]: dt
Out[284]: dtype([('f0', '<f4'), ('f1', '<f4')])

In [285]: iterable = (np.random.randn(2) for _ in range(3))
In [286]: np.fromiter(iterable,dt, 3)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-286-0301507c38c2> in <module>
----> 1 np.fromiter(iterable,dt, 3)

ValueError: setting an array element with a sequence.

The error persists because each item is still an ndarray. Make the iterator produce tuples instead (the normal input to a structured array is a list of tuples):

In [287]: iterable = (tuple(np.random.randn(2)) for _ in range(3))
In [288]: np.fromiter(iterable,dt, 3)
Out[288]: 
array([(-0.56128544,  0.03609687), ( 0.4170706 , -1.5592302 ),
       ( 2.4143908 , -0.96777505)], dtype=[('f0', '<f4'), ('f1', '<f4')])
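
The result above is a 1-d structured array of shape (3,), not the (3, 2) float array the question asks for. One possible way to convert it, sketched here under two assumptions of mine (a float64 dtype 'f8,f8' instead of 'f,f', and the structured_to_unstructured helper from numpy.lib.recfunctions), might look like this:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Assumption: two float64 fields ('f,f' in the answer gives float32 fields)
dt = np.dtype('f8,f8')

iterable = (tuple(np.random.randn(2)) for _ in range(3))
structured = np.fromiter(iterable, dt, 3)  # 1-d structured array, shape (3,)

# Turn the two fields into two ordinary columns
plain = rfn.structured_to_unstructured(structured)
print(plain.shape, plain.dtype)
```

This adds a conversion step after fromiter, so it is not obviously cheaper than the plain np.array(list) version; it is only meant to show how the two layouts relate.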

8 Comments

Thanks, I should have read the documentation in more detail. If you have any advice, or if you think my "default" solution is fine, I would be happy to hear it.
This might be one case where I'd be ok mixing itertools with numpy. I've posted an answer.
@ValentinMacé there is a subtle but crucial question: Are you looking for a 2-dimensional array, or an array/list containing 3 arrays?
@ValentinMacé Nope, not at all, which is why I asked! It's also why I find this answer a bit perplexing. What is the easiest way I can send you a small runnable example?
A multidimensional array can be described/viewed/thought of as an array of arrays, etc., but the actual storage is more flexible, allowing for easy access along all dimensions, and in various ways (including reshaping and transpose). While it is possible to construct an array that is explicitly composed of other arrays, that storage is not as flexible, and not as fast. Focus on the basic multidimensional constructors and methods first.

The docs explicitly state

Create a new 1-dimensional array from an iterable object.

This is consistent with the fact that you pass in the dtype explicitly. In your case, an ndarray is not a float, hence the error.

You can get around this by flattening the input iterable, e.g., with itertools.chain.from_iterable:

np.fromiter(itertools.chain.from_iterable(iterable), float, 6).reshape(3, 2)

This approach has the advantage that it never collects the rows into an intermediate list or tuple: each row is consumed as soon as the generator produces it. A slightly more expensive, but possibly less arcane, method would be to expand iterable into itertools.chain directly (the * unpacking materializes all rows into a tuple before chaining):

np.fromiter(itertools.chain(*iterable), float, 6).reshape(3, 2)
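
For completeness, a self-contained sketch of the flattening approach. The make_rows generator is a hypothetical stand-in for the custom function mentioned in the question, and the n_rows/n_cols names are mine:

```python
import itertools

import numpy as np


def make_rows(n_rows, n_cols):
    """Hypothetical stand-in for the custom function: yields one row at a time."""
    for _ in range(n_rows):
        yield np.random.randn(n_cols)


n_rows, n_cols = 3, 2

# Flatten the stream of rows into a stream of scalars, lazily
flat_values = itertools.chain.from_iterable(make_rows(n_rows, n_cols))

# fromiter fills a 1-d array of known length, which is then reshaped to 2-d
result = np.fromiter(flat_values, dtype=float, count=n_rows * n_cols)
result = result.reshape(n_rows, n_cols)
print(result.shape)
```

Passing count lets fromiter allocate the output once instead of growing it as items arrive.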
