Updated answer:
The variable y is a numpy array which contained a mix of strings and numpy arrays. Its dtype is object, so numpy doesn't treat it as a 2-D table, even though it holds only 4-element numpy arrays by the end of the preprocessing.
You could either avoid mixing object types by storing the result in a variable other than y, or convert y.values with:
array = np.array([x.astype('int32') for x in y.values])
As an example:
import numpy as np
y = np.array(["left", "right"], dtype="object")
y[0] = np.array([1,0])
y[1] = np.array([0,1])
print(y)
# [[1 0] [0 1]]
print(y.dtype)
# object
print(y.shape)
# (2,)
y = np.array([x.astype('int32') for x in y])
print(y)
# [[1 0]
# [0 1]]
print(y.dtype)
# int32
print(y.shape)
# (2, 2)
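The first option (not reusing y for the converted values) could look like the sketch below. The names labels, one_hot and encoded, and the two-class mapping, are made up for illustration and are not taken from the question; the idea is only that the strings stay in one array while the numeric rows go into a new one, so numpy can infer a regular 2-D shape.

```python
import numpy as np

# Hypothetical labels and one-hot mapping, for illustration only.
labels = np.array(["left", "right", "left"], dtype=object)
one_hot = {"left": np.array([1, 0]), "right": np.array([0, 1])}

# Build the encoded rows in a NEW variable instead of overwriting
# the string entries of `labels` in place.
encoded = np.array([one_hot[label] for label in labels], dtype="int32")

print(encoded.dtype)
# int32
print(encoded.shape)
# (3, 2)
```

Because encoded never held strings, its dtype is numeric from the start and no astype conversion is needed afterwards.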
Original answer:
Your array is somehow incomplete. It has 38485 elements, many of which look like 4-element arrays. But somewhere in the middle, there must be at least one inner array which doesn't have 4 elements. Or you might have a mix of collection types (lists, arrays, and so on).
That could be why the second dimension is missing from the shape.
Here's an example with one (8, 4) array and a copy of it, with just one element missing:
import numpy as np
data = np.array([[0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]])
print(data.shape)
# (8, 4)
print(data.dtype)
# int64
print(set(len(sub_array) for sub_array in data))
# set([4])
print(data.reshape(-1, 4))
# [[0 1 0 0]
# [0 1 0 0]
# [1 0 0 0]
# [0 1 0 0]
# [0 1 0 0]
# [0 1 0 0]
# [0 1 0 0]
# [1 0 0 0]]
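# Recent numpy versions refuse to build a ragged array implicitly, so dtype=object is passed explicitly here.
broken_data = np.array([[0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]], dtype=object)
print(broken_data.shape)
# (8,)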
print(broken_data.dtype)
# object
print(set(len(sub_array) for sub_array in broken_data))
# set([3, 4])
print(broken_data.reshape(-1, 4))
# [[[0, 1, 0, 0] [0, 1, 0, 0] [1, 0, 0, 0] [1, 0, 0]]
# [[0, 1, 0, 0] [0, 1, 0, 0] [0, 1, 0, 0] [1, 0, 0, 0]]]
print([sub_array for sub_array in broken_data if len(sub_array) != 4])
# [[1, 0, 0]]
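Find the sub-arrays that don't have exactly 4 elements and either filter them out or modify them.
Once every sub-array has 4 elements, you'll have a proper 2-D array with 4 columns ((38485, 4) if you repair the bad rows rather than drop them), and you won't have to call reshape.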
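As a minimal sketch of the filtering approach, assuming every valid row should have 4 elements: the names rows, expected_len, bad_indices and clean are illustrative, and the tiny ragged input stands in for the real 38485-element array.

```python
import numpy as np

# Small ragged stand-in for the real data: the second row has only 3 elements.
rows = [[0, 1, 0, 0], [1, 0, 0], [1, 0, 0, 0]]
broken_data = np.array(rows, dtype=object)  # 1-D object array of lists

expected_len = 4

# Locate the offending rows first, so they can be inspected or repaired.
bad_indices = [i for i, row in enumerate(broken_data) if len(row) != expected_len]
print(bad_indices)
# [1]

# Keep only the well-formed rows and rebuild a regular numeric array.
clean = np.array([row for row in broken_data if len(row) == expected_len], dtype="int64")
print(clean.shape)
# (2, 4)
print(clean.dtype)
# int64
```

Dropping rows changes the row count, so if the array has to stay aligned with other data (labels, features), repairing the rows at bad_indices may be the safer choice.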
What does set(len(sub_array) for sub_array in array) return?
reshape cannot change the total number of elements. 38485*4 is larger than the original 38485. But what is the dtype of your array? Integers or object?