I am trying to read data from a CSV file into a NumPy array. Since the CSV file contains empty fields, I read all of the data into an array of dtype=str and plan to convert rows/columns into appropriate numerical types. The example below is my unsuccessful attempt at converting these array dtypes.

import numpy as np

x = np.array([
['name', 'property', 'value t0', 'value t1', 'value t2'],
['a', 0.5, 1, 2, 3],
['b', 0.2, 5, 10, 100],
['c', 0.7, 3, 6, 9],
], dtype=str)

First, let's view the original array.

print("\n .. x (shape={}, dtype={}):\n{}\n".format(x.shape, x.dtype, x))
[['name' 'property' 'value t0' 'value t1' 'value t2']
 ['a' '0.5' '1' '2' '3']
 ['b' '0.2' '5' '10' '100']
 ['c' '0.7' '3' '6' '9']]

Then, let's make sure the numerical entries (from the second row down and the third column right, i.e. x[1:, 2:]) can be converted to int.

print(x[1:, 2:].astype(int))
[[  1   2   3]
 [  5  10 100]
 [  3   6   9]]

So, I tried to put these concepts together.

x[1:, 2:] = np.array(x[1:, 2:], dtype=int)
# also tried: x[1:, 2:] = x[1:, 2:].astype(int)

print(x)
[['name' 'property' 'value t0' 'value t1' 'value t2']
 ['a' '0.5' '1' '2' '3']
 ['b' '0.2' '5' '10' '100']
 ['c' '0.7' '3' '6' '9']]

Why do the selected entries remain strings? I saw similar questions, for which the accepted solution appears to be named fields. However, I prefer numerical indexing to named fields for my use case.

  • You can't apply different dtypes to different parts of an array. It looks like you should probably be using something like Pandas, not NumPy directly. Commented Mar 19, 2020 at 1:54
  • Take a look at Structured Array Commented Mar 19, 2020 at 1:57
  • The named field, structured array, approach would allow [('name','U1'),('property',float), ...] dtype. An alternative is object dtype, where elements are stored in a list-like manner. Otherwise you can't have a mix of dtypes. A pandas dataframe would also have named columns, and a separate Series for each column. Commented Mar 19, 2020 at 2:37

1 Answer

In [83]: alist = [ 
    ...: ['name', 'property', 'value t0', 'value t1', 'value t2'], 
    ...: ['a', 0.5, 1, 2, 3], 
    ...: ['b', 0.2, 5, 10, 100], 
    ...: ['c', 0.7, 3, 6, 9], 
    ...: ]                                                                                                           
In [84]: alist                                                                                                       
Out[84]: 
[['name', 'property', 'value t0', 'value t1', 'value t2'],
 ['a', 0.5, 1, 2, 3],
 ['b', 0.2, 5, 10, 100],
 ['c', 0.7, 3, 6, 9]]
In [85]: np.array(alist)                                                                                             
Out[85]: 
array([['name', 'property', 'value t0', 'value t1', 'value t2'],
       ['a', '0.5', '1', '2', '3'],
       ['b', '0.2', '5', '10', '100'],
       ['c', '0.7', '3', '6', '9']], dtype='<U8')
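
The single `<U8` dtype above is why the assignment in the question appears to do nothing: an array has one dtype, so integers written into a string array are cast back to strings on assignment. A minimal sketch of that round trip, using the question's data (the name `ints` is only for illustration):

```python
import numpy as np

x = np.array([
    ['name', 'property', 'value t0', 'value t1', 'value t2'],
    ['a', 0.5, 1, 2, 3],
    ['b', 0.2, 5, 10, 100],
    ['c', 0.7, 3, 6, 9],
], dtype=str)

# astype returns a NEW array with an integer dtype ...
ints = x[1:, 2:].astype(int)

# ... but assigning it into x converts every value back to x's dtype.
x[1:, 2:] = ints

print(ints.dtype)   # an integer dtype
print(x.dtype)      # still a unicode string dtype
```

Keeping the converted slice in its own variable (as `ints` does here) is the simplest way to hold on to the numeric values.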

object array (each element keeps its own Python type):

In [87]: np.array(alist, dtype=object)                                                                               
Out[87]: 
array([['name', 'property', 'value t0', 'value t1', 'value t2'],
       ['a', 0.5, 1, 2, 3],
       ['b', 0.2, 5, 10, 100],
       ['c', 0.7, 3, 6, 9]], dtype=object)
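
If the starting point is the string array from the question rather than the original list, one possible route is to copy it to object dtype first and then assign the converted slices. This is only a sketch; the name `y` is illustrative, and truly empty CSV fields would need handling before `astype`, since an empty string cannot be parsed as a number.

```python
import numpy as np

x = np.array([
    ['name', 'property', 'value t0', 'value t1', 'value t2'],
    ['a', 0.5, 1, 2, 3],
    ['b', 0.2, 5, 10, 100],
    ['c', 0.7, 3, 6, 9],
], dtype=str)

# An object array stores references, so each cell can hold a different type.
y = x.astype(object)

# Numerical (positional) indexing still works for the assignments.
y[1:, 1] = x[1:, 1].astype(float)
y[1:, 2:] = x[1:, 2:].astype(int)

print(y)
```

Object arrays give up most of NumPy's speed, so for actual arithmetic it is usually better to pull out a typed slice, for example `y[1:, 2:].astype(int)`.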

structured array (one record per data row, each field with its own dtype; the header row is left out):

In [88]: np.array([tuple(row) for row in alist[1:]], dtype='U1,f,i,i,i')                                             
Out[88]: 
array([('a', 0.5, 1,  2,   3), ('b', 0.2, 5, 10, 100),
       ('c', 0.7, 3,  6,   9)],
      dtype=[('f0', '<U1'), ('f1', '<f4'), ('f2', '<i4'), ('f3', '<i4'), ('f4', '<i4')])
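
Since the question prefers numerical indexing, note that fields of a structured array can still be reached by position by looking the name up in `dtype.names`. A small sketch, assuming the default field names `f0`..`f4` shown above (the names `field` and `col` are illustrative):

```python
import numpy as np

rows = [
    ('a', 0.5, 1, 2, 3),
    ('b', 0.2, 5, 10, 100),
    ('c', 0.7, 3, 6, 9),
]
arr = np.array(rows, dtype='U1,f,i,i,i')

# dtype.names is a tuple of field names in column order,
# so a column number can be translated into a field name.
field = arr.dtype.names[2]
col = arr[field]

print(field)
print(col)
```

Rows are indexed with integers as usual, so `arr[1]` is the record for `'b'`.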

pandas (a separate dtype per column):

In [90]: import pandas as pd                                                                                         
In [91]: pd.DataFrame(alist[1:], columns=alist[0])                                                                   
Out[91]: 
  name  property  value t0  value t1  value t2
0    a       0.5         1         2         3
1    b       0.2         5        10       100
2    c       0.7         3         6         9
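
A DataFrame also supports purely positional indexing through `.iloc`, which may suit the numerical-indexing preference stated in the question. The sketch below rebuilds the same frame and extracts the value columns as an integer NumPy array (the name `values` is illustrative):

```python
import pandas as pd

alist = [
    ['name', 'property', 'value t0', 'value t1', 'value t2'],
    ['a', 0.5, 1, 2, 3],
    ['b', 0.2, 5, 10, 100],
    ['c', 0.7, 3, 6, 9],
]
df = pd.DataFrame(alist[1:], columns=alist[0])

# .iloc selects by integer position: all rows, columns 2 onward.
values = df.iloc[:, 2:].to_numpy()

print(df.dtypes)
print(values)
```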