I am new to dask and am trying to reshape a dask array that I obtained from a single column of a dask dataframe, but I am running into errors. Does anyone know of a fix that does not force a compute? Thanks!
Example:
import pandas as pd
import numpy as np
from dask import dataframe as dd, array as da
df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})
ddf = dd.from_pandas(df, npartitions=2)
# This does not work; it raises ValueError: cannot convert float NaN to integer
ddf['x'].values.reshape([-1,1])
# This works, but requires a compute
ddf['x'].values.compute().reshape([-1,1])
# This works if the dask array is created directly from a NumPy array
ar = np.array([1, 2, 3])
dar = da.from_array(ar, chunks=2)
dar.reshape([-1,1])