I am trying to convert a pandas DataFrame into a NumPy structured array (an array of structs) whose fields have specific NumPy dtypes. I tried .to_numpy():

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["John", "Bill", "William"],
    "age": [25, 34, 41],
})
arr_from_df = df.to_numpy()
print(arr_from_df.dtype)  # object
print(arr_from_df.dtype.fields)  # None

So arr_from_df.dtype carries no information about the datatype of each column: the whole array is a plain object array with no fields. I need the same result as if I had created the NumPy array directly, like this:

arr = np.array(
    [("John", 25), ("Bill", 34), ("William", 41)],
    dtype=[("name", "U16"), ("age", np.int32)]
)
print(arr.dtype)  # [('name', '<U16'), ('age', '<i4')]
print(arr.dtype.fields)  # {'name': (dtype('<U16'), 0), 'age': (dtype('int32'), 64)}

How do I convert it?

I would prefer to use built-in pandas or NumPy functions for performance reasons, so I would like to avoid native Python loops.
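One possible direction, offered as a minimal sketch rather than a definitive answer: pandas has DataFrame.to_records, which returns a NumPy record array and accepts a column_dtypes mapping. The sketch below assumes that the "U16" and int32 field types from the example above are the desired ones, and that every name fits in 16 characters. The variable name rec is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["John", "Bill", "William"],
    "age": [25, 34, 41],
})

# index=False leaves the DataFrame index out of the result.
# column_dtypes maps each column name to the field dtype it should get;
# "U16" and np.int32 are taken from the hand-built array in the question.
rec = df.to_records(
    index=False,
    column_dtypes={"name": "U16", "age": np.int32},
)

print(rec.dtype.names)   # field names, in column order
print(rec.dtype.fields)  # per-field dtype and byte offset
print(rec["age"])        # a single field, accessed by name
```

The result is a np.recarray, which is an ndarray subclass, so field access such as rec["name"] works the same way as on a plain structured array. The conversion is done column by column inside pandas and NumPy, without a Python-level loop over the rows.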
