Converting a List of Tuples to numpy array results in single dimension

Question

We have a list of tuples in the form (year, value):

splist

[(2002, 10.502535211267606),
 (2003, 10.214794520547946),
 (2004, 9.8115789473684227),
  ..
 (2015, 9.0936585365853659),
 (2016, 9.2442725379351387)]

The intention is to convert the list of tuples to a two-dimensional numpy array. However, the published answers that use np.asarray produce an array with a single dimension:

import numpy as np

dt = np.dtype('int,float')
spp = np.asarray(splist, dt)

spp
   array([(2002, 10.502535211267606), (2003, 10.214794520547946),
   (2004, 9.811578947368423), (2005, 9.684155844155844),
   ..
   (2014, 9.438987341772153), (2015, 9.093658536585366),
   (2016, 9.244272537935139)],
  dtype=[('f0', '<i8'), ('f1', '<f8')])

This becomes clear when viewing the dimensions of the output:

In [155]: spp.shape
Out[155]: (15,)

What we wanted:

   array([[(2002, 10.502535211267606)],
          [(2003, 10.214794520547946)],
          ..
          [(2014, 9.438987341772153)],
          [(2015, 9.093658536585366)],
          [(2016, 9.244272537935139)]])

So what is the magic to convert the list of tuples to a two dimensional array?

That expected array([(2002, 10.502535211267606)],.. doesn't look like a 2D one.
– Divakar, Commented Nov 14, 2016 at 19:49
It still doesn't look like a 2D array per se. It has two dimensions, but all of the data would still be only along the 0'th dimension, i.e., the shape you describe is (15, 1).
– Praveen, Commented Nov 14, 2016 at 19:57
@Praveen Not sure how you arrived at that conclusion: it is a correct 2D array now. It was not previously.
– WestCoastProjects, Commented Nov 14, 2016 at 19:59
This isn't a list of tuples issue. It's a question of how to reshape a (15,) array to a (15,1) array. Without the dt you would get a (15,2) array of floats.
– hpaulj, Commented Nov 14, 2016 at 20:23

Praveen · Accepted Answer · 2016-11-14 19:55:15Z

10

If you want a two-dimensional result, just use np.array, instead of asarray:

>>> import numpy as np
>>> a = [(2002, 10.502535211267606),
...  (2003, 10.214794520547946),
...  (2004, 9.8115789473684227),
...  (2015, 9.0936585365853659),
...  (2016, 9.2442725379351387)]
>>> np.array(a)    
array([[ 2002.        ,    10.50253521],
       [ 2003.        ,    10.21479452],
       [ 2004.        ,     9.81157895],
       [ 2015.        ,     9.09365854],
       [ 2016.        ,     9.24427254]])
>>> np.array(a).shape
(5, 2)

Note that this will make both columns of floating point dtype. It's not possible to have a plain 2D numpy array with a different dtype in each column. If you want that, I think Pandas has a way, though I don't have any experience with Pandas.

The only thing you can do with numpy is to have a 1D structured array, with each element being a record that holds one (int, float) pair, but that's what you already have with asarray and the dt dtype.
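The 1D array produced with the 'int,float' dtype is what NumPy calls a structured array. As a minimal sketch, assuming a shortened three-row version of the sample data, each column can be pulled out by its auto-generated field name (f0, f1) and keeps its own dtype; stacking the fields back into a plain 2D array upcasts everything to float:

```python
import numpy as np

# Shortened sample data (an assumption for illustration).
splist = [(2002, 10.502535211267606),
          (2003, 10.214794520547946),
          (2004, 9.8115789473684227)]

# Structured dtype: one integer field and one float field per record.
spp = np.array(splist, dtype='int,float')

years = spp['f0']    # 1D integer array
values = spp['f1']   # 1D float array

# Combining the fields into a plain 2D array forces a single dtype (float).
plain = np.column_stack([years, values])

print(spp.shape, years.dtype.kind, values.dtype.kind)
print(plain.shape, plain.dtype)
```

Whether the structured form or the plain float form is preferable depends on whether the year column needs to stay an integer.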

edited Nov 14, 2016 at 19:55

answered Nov 14, 2016 at 19:49

Praveen


5 Comments

WestCoastProjects Over a year ago

np.array was the first thing I tried, even before asarray. I'm not sure why it did not work for me.

WestCoastProjects Over a year ago

It seems to be working. I don't know what sequence of operations made it not work for me previously.

WestCoastProjects Over a year ago

Ah! I know now. I had used a Scala-like type alias as follows: npa=numpy.array. That works for some things but did the wrong thing in this case: doing spp = npa(splist) ends up with the 15x1 instead of the 15x2 that spp=np.array(splist) gives.

WestCoastProjects Over a year ago

@wwii First: it is required to wait some time before accepting, and it was still within the minimum window. Note that I had already upvoted both of them since they were helpful. Second: there were two correct answers. I had to choose one and went with the other because the datatype is correct on it.

hpaulj Over a year ago

asarray does not make a difference. It's the dt datatype that matters.
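A quick way to check this is to try all four combinations. The sketch below uses a shortened, rounded version of the sample data (an assumption for illustration) and records the resulting shapes; only the presence of the structured dtype changes the dimensionality:

```python
import numpy as np

# Shortened, rounded sample data (an assumption for illustration).
splist = [(2002, 10.5), (2003, 10.2), (2004, 9.8)]
dt = np.dtype('int,float')

# Same input, with and without the structured dtype, through both functions.
shapes = {
    'array, no dtype': np.array(splist).shape,
    'asarray, no dtype': np.asarray(splist).shape,
    'array, dt': np.array(splist, dt).shape,
    'asarray, dt': np.asarray(splist, dt).shape,
}

for label, shape in shapes.items():
    print(label, shape)
```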

Cory Kramer · Accepted Answer · 2016-11-14 19:51:41Z

2

If I understand your desired output correctly, you can use numpy.reshape:

>>> spp = np.asarray(splist, dt)
>>> spp
array([(2002, 10.502535211267606),
       (2003, 10.214794520547946),
       (2004, 9.811578947368423),
       (2015, 9.093658536585366),
       (2016, 9.244272537935139)], 
      dtype=[('f0', '<i4'), ('f1', '<f8')])

>>> np.reshape(spp, (spp.size, 1))
array([[(2002, 10.502535211267606)],
       [(2003, 10.214794520547946)],
       [(2004, 9.811578947368423)],
       [(2015, 9.093658536585366)],
       [(2016, 9.244272537935139)]], 
      dtype=[('f0', '<i4'), ('f1', '<f8')])
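Two equivalent spellings of the same reshape may be handy, sketched here on a shortened version of the data (an assumption for illustration): passing -1 lets NumPy infer the row count, and indexing with np.newaxis appends a trailing axis. Both keep the structured dtype unchanged.

```python
import numpy as np

# Shortened, rounded sample data (an assumption for illustration).
splist = [(2002, 10.5), (2003, 10.2), (2004, 9.8)]
spp = np.asarray(splist, dtype='int,float')

col_a = spp.reshape(-1, 1)     # -1 lets NumPy infer the number of rows
col_b = spp[:, np.newaxis]     # add a trailing axis by indexing

print(spp.shape, col_a.shape, col_b.shape)
```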

answered Nov 14, 2016 at 19:51

Cory Kramer
