
I have an array

[[0, 1, 0, 0] [0, 1, 0, 0] [1, 0, 0, 0] ..., [0, 1, 0, 0] [0, 1, 0, 0] [1, 0, 0, 0]]

of shape (38485,) that I want to reshape to (38485, 4), like this:

[[0, 1, 0, 0] 
[0, 1, 0, 0] 
[1, 0, 0, 0]
.
.
.
[0, 1, 0, 0]
[0, 1, 0, 0]
[1, 0, 0, 0]]

But when I try array.reshape(-1,4), it throws the error ValueError: cannot reshape array of size 38485 into shape (4)

My code to get array:

import numpy as np
import pandas as pd

dataset = pd.read_csv('train.csv')

y = dataset.iloc[:, 6]

fr=np.array([1,0,0,0])
re=np.array([0,1,0,0])
le=np.array([0,0,1,0])
ri=np.array([0,0,0,1])
for i in range(y.shape[0]):
    if y[i]=="Front":
        y[i]=fr
    elif y[i]=="Rear":
        y[i]=re
    elif y[i]=="Left":
        y[i]=le
    elif y[i]=="Right":
        y[i]=ri

array=y.values

Is there any way I can accomplish this?

I fixed this with

array = np.array([[n for n in row] for row in array])

Thanks to wim.
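
A possible alternative to the list comprehension, as a minimal sketch: np.stack joins a sequence of equal-shape arrays along a new axis, so it should also turn a 1-D object array of 4-element arrays into a 2-D integer array. The small `array` built here is a hypothetical stand-in for `y.values`, not the real data.

```python
import numpy as np

# Hypothetical stand-in for y.values: a 1-D object array whose
# elements are themselves 4-element integer arrays.
array = np.empty(3, dtype=object)
array[0] = np.array([0, 1, 0, 0])
array[1] = np.array([0, 1, 0, 0])
array[2] = np.array([1, 0, 0, 0])
print(array.shape)    # (3,)

# np.stack joins the inner arrays along a new first axis.
stacked = np.stack(array)
print(stacked.shape)  # (3, 4)
```

This only works when every inner array has the same shape; np.stack raises a ValueError otherwise.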

  • What does set(len(sub_array) for sub_array in array) return? Commented May 30, 2017 at 16:47
  • reshape cannot change the total number of elements. 38485*4 is larger than the original 38485. But what is the dtype of your array? Integers or object? Commented May 30, 2017 at 17:24

2 Answers


Updated answer:

The variable y is a pandas Series that originally contained strings and, after the loop, contains numpy arrays. Its dtype is object, so numpy doesn't treat y.values as a 2-D table, even though every element is a 4-element numpy array by the end of the preprocessing.

You could either avoid mixing object types by storing the encoded rows in a variable other than y, or convert y.values with:

array = np.array([x.astype('int32') for x in y.values])

As an example:

import numpy as np
y = np.array(["left", "right"], dtype = "object")
y[0] = np.array([1,0])
y[1] = np.array([0,1])
print(y)
# [[1 0] [0 1]]
print(y.dtype)
# object
print(y.shape)
# (2,)
y = np.array([x.astype('int32') for x in y])
print(y)
# [[1 0]
#  [0 1]]
print(y.dtype)
# int32
print(y.shape)
# (2, 2)
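
The first option, keeping the encoded rows in a separate variable instead of overwriting y, could look roughly like the sketch below. The `labels` list and the `one_hot` dictionary are hypothetical names introduced for illustration; the four rows mirror fr, re, le and ri from the question.

```python
import numpy as np

# Hypothetical stand-in for the label column read from train.csv.
labels = ["Front", "Rear", "Left", "Right", "Rear"]

# One-hot row for each label, matching fr/re/le/ri in the question.
one_hot = {
    "Front": [1, 0, 0, 0],
    "Rear":  [0, 1, 0, 0],
    "Left":  [0, 0, 1, 0],
    "Right": [0, 0, 0, 1],
}

# Build a new 2-D integer array; the labels themselves are left untouched.
encoded = np.array([one_hot[label] for label in labels])
print(encoded.shape)  # (5, 4)
```

Because the strings and the numeric rows never share a container, numpy can infer a regular integer array directly. A label missing from the dictionary would raise a KeyError, which also makes unexpected values in the column visible early.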

Original answer:

Your array is somehow incomplete. It has 38485 elements, many of which look like 4-element arrays. But somewhere in the middle, there must be at least one inner array that doesn't have 4 elements. Or you might have a mix of collection types (list, array).

That would explain why the shape has no second dimension.

Here's an example with one (8, 4) array and a copy of it, with just one element missing:

import numpy as np

data = np.array([[0, 1, 0, 0],[0, 1, 0, 0],[1, 0, 0, 0] , [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0],[1, 0, 0, 0]])
print(data.shape)
# (8, 4)
print(data.dtype)
# int64
print(set(len(sub_array) for sub_array in data))
# {4}
print(data.reshape(-1, 4))
# [[0 1 0 0]
#  [0 1 0 0]
#  [1 0 0 0]
#  [0 1 0 0]
#  [0 1 0 0]
#  [0 1 0 0]
#  [0 1 0 0]
#  [1 0 0 0]]

# NumPy 1.24 and later require dtype=object for ragged input like this.
broken_data = np.array([[0, 1, 0, 0],[0, 1, 0, 0],[1, 0, 0, 0] , [1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0],[1, 0, 0, 0]], dtype=object)
print(broken_data.shape)
# (8,)
print(broken_data.dtype)
# object
print(set(len(sub_array) for sub_array in broken_data))
# {3, 4}
print(broken_data.reshape(-1, 4))
# [[[0, 1, 0, 0] [0, 1, 0, 0] [1, 0, 0, 0] [1, 0, 0]]
#  [[0, 1, 0, 0] [0, 1, 0, 0] [0, 1, 0, 0] [1, 0, 0, 0]]]
print([sub_array for sub_array in broken_data if len(sub_array) != 4])
# [[1, 0, 0]]

Find the sub-arrays that don't have exactly 4 elements and either filter them out or modify them.

You'll then have a (38485,4) array, and you won't have to call reshape.
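
As a rough sketch of those two options, assuming the data is a list of lists in which no row is longer than 4: the `rows` list is a hypothetical stand-in, and padding with 0 is only one possible way to "modify" a short row.

```python
import numpy as np

# Hypothetical ragged input: the second row has only 3 elements.
rows = [[0, 1, 0, 0], [1, 0, 0], [1, 0, 0, 0]]

# Option 1: keep only the rows that have exactly 4 elements.
filtered = np.array([row for row in rows if len(row) == 4])
print(filtered.shape)  # (2, 4)

# Option 2: pad short rows with zeros up to length 4.
# Whether 0 is the right fill value depends on the data.
padded = np.array([row + [0] * (4 - len(row)) for row in rows])
print(padded.shape)    # (3, 4)
```

Filtering changes the number of rows, so any matching feature rows would need the same filter applied to stay aligned.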


13 Comments

My print(set(len(sub_array) for sub_array in broken_data)) prints {4}, and I'm sure there are no missing values in the inner lists. print([sub_array for sub_array in broken_data if len(sub_array) != 4]) prints [], which shows I have no broken arrays.
Interesting. What about print(set(type(sub_array) for sub_array in broken_data))?
it prints {<class 'numpy.ndarray'>}
Thanks for the fast answers. What about print(set(sub_array.dtype for sub_array in broken_data))? :)
It prints {dtype('int32')}. I should be the one thanking you :)

The array length must be a multiple of 4. 38485 is not a multiple of 4. Otherwise, the reshape as you have written it should work correctly:

array.reshape(-1,4)
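
A small sketch of that size rule, using made-up arrays rather than the data from the question: reshape keeps the total number of elements, so with -1 as the first dimension the element count has to be divisible by 4.

```python
import numpy as np

ok = np.arange(12)              # 12 elements, a multiple of 4
print(ok.reshape(-1, 4).shape)  # (3, 4)

bad = np.arange(10)             # 10 elements, not a multiple of 4
try:
    bad.reshape(-1, 4)
except ValueError as exc:
    print("reshape failed:", exc)
```

The count that matters is the number of elements in the outer array. A (38485,) object array has 38485 elements no matter what each element holds, so reshape cannot unpack the inner arrays.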

4 Comments

I tried deleting an element from the array to make the size (38484,). After reshaping with array.reshape(-1,4) I got (9621, 4), but I need (38484,4). When I print the array, all the elements appear next to each other on the same line, but I need them one below the other, as in the question. I'm sure that every inner array has 4 elements.
What is the array.dtype?
@wim I'd bet on object
Try this: array = np.array([[n for n in row] for row in array]) and see if it resolves your problem.
