Is there an easier way of doing this array assignment in python/numpy?

Question

I am looking for a more readable way of doing the array assignment (the for loop part).

spike_train = np.zeros((2,2000)) # Array to be filled

spikes = np.random.randint(low=0, high = spike_train.shape[1]-1, size = (2,100)) # Indexes to fill

for i in range(spikes.shape[0]):  # I feel that this part could be more readable
    spike_train[i, spikes[i,:]] = 1

#I was thinking something along these lines:  spike_train[spikes] = 1, but I know this doesn't work
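For readers wondering why `spike_train[spikes] = 1` fails: a single integer array indexes the first axis (the rows), and that axis only has size 2. A minimal sketch, assuming a small made-up `spikes` array chosen only for illustration:

```python
import numpy as np

spike_train = np.zeros((2, 10))
spikes = np.array([[4, 8, 0], [8, 3, 3]])  # made-up column indexes, for illustration only

try:
    # A lone integer array is applied to axis 0 (the rows), which has size 2,
    # so any value of 2 or more is out of bounds.
    spike_train[spikes] = 1
except IndexError as err:
    print("IndexError:", err)
```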

Daniel F · Accepted Answer · 2020-10-27 15:38:13Z

3

You need to broadcast the first dimension (instead of repeating it as @Valdi_Bo and @Andre do, which constructs a big intermediate array):

spike_train = np.zeros((2,2000)) # Array to be filled

spikes = np.random.randint(low=0, high = spike_train.shape[1]-1, size = (2,100)) # Indexes to fill

spike_train[np.arange(spikes.shape[0])[:, None], spikes] = 1
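It may help to look at the shapes involved. The sketch below checks that the broadcast assignment gives the same result as the loop from the question. The seeded generator and the names `rows` and `expected` are illustrative additions, not part of the original code:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible

spike_train = np.zeros((2, 2000))
# high is exclusive, so this can pick any valid column index
spikes = rng.integers(low=0, high=spike_train.shape[1], size=(2, 100))

# Column vector of row numbers, shape (2, 1): [[0], [1]]
rows = np.arange(spikes.shape[0])[:, None]

# rows (2, 1) broadcasts against spikes (2, 100), so every column index
# in row i of `spikes` is paired with row number i.
spike_train[rows, spikes] = 1

# Reference result from the explicit loop
expected = np.zeros_like(spike_train)
for i in range(spikes.shape[0]):
    expected[i, spikes[i]] = 1

print(rows.shape, spikes.shape)                # (2, 1) (2, 100)
print(np.array_equal(spike_train, expected))   # True
```

No `(2, 100)` array of row numbers is ever built; NumPy expands the `(2, 1)` index virtually during indexing.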

You can also use np.put_along_axis in this case:

np.put_along_axis(spike_train, spikes, values = 1, axis = 1)
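As a quick sanity check, the two approaches can be compared directly. This is a sketch under the same assumptions as the original (a 2 x 2000 array with 100 indexes per row); the arrays `a` and `b` and the seeded generator are illustrative names only:

```python
import numpy as np

rng = np.random.default_rng(1)  # seeded only to make the sketch reproducible
spikes = rng.integers(low=0, high=2000, size=(2, 100))

# Broadcast fancy indexing
a = np.zeros((2, 2000))
a[np.arange(spikes.shape[0])[:, None], spikes] = 1

# np.put_along_axis modifies its first argument in place and returns None
b = np.zeros((2, 2000))
np.put_along_axis(b, spikes, 1, axis=1)

print(np.array_equal(a, b))  # True
```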

edited Oct 27, 2020 at 15:38

answered Oct 27, 2020 at 11:45

Daniel F


1 Comment

Potatoconomy Over a year ago

Thanks, this taught me something new. np.arange(spikes.shape[0])[:, None] This is a pretty clever way to get an array into that shape!

Andre · Accepted Answer · 2020-10-28 14:40:54Z

2

Edit: As @DanielF pointed out, the approach I describe here is not best practice. I'll leave it here for reference, but see @DanielF's answer for a better solution.

Another approach is to get the "x, y coordinates" of the spikes and set those to 1.

Here x refers to the row (i.e. 0 or 1) and y refers to the column index within that row.

Here's how you could implement this:

import numpy as np

spike_train = np.zeros((2,2000)) # Array to be filled
n_spikes_per_row = 100
x = np.repeat([0,1], n_spikes_per_row)
y = np.random.randint(low=0, high = spike_train.shape[1]-1, size = 2 * n_spikes_per_row) # Indexes to fill

spike_train[x, y] = 1

It may be redundant to say, but x and y are flat arrays holding the randomly selected coordinates, like so:

x = [0,  0,  0,  0,  0,  1,  1,  1,  1,  1]
y = [13, 5,  7,  4,  9, 14,  1,  1, 17,  4]

As a final comment: since you're selecting random integers, the same value can appear in y more than once for the same row, which leads to fewer 'spikes' in the final array. Maybe you already knew this, or it does not matter for your implementation, but I wanted to mention it just in case. Good luck!
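If an exact number of spikes per row matters, one option is to sample without replacement. The sketch below contrasts the two; it uses `np.random.default_rng` and `Generator.choice` rather than the `np.random.randint` call above, and all variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)  # seeded only to make the sketch reproducible
n_rows, train_len, n_spikes = 2, 2000, 100

# Sampling with replacement: duplicates are possible, so a row can end up
# with fewer than n_spikes ones.
with_dupes = rng.integers(low=0, high=train_len, size=(n_rows, n_spikes))
train_a = np.zeros((n_rows, train_len))
np.put_along_axis(train_a, with_dupes, 1, axis=1)
print(train_a.sum(axis=1))  # each entry is at most 100

# Sampling without replacement, one row at a time: exactly n_spikes
# distinct indexes per row.
unique = np.stack(
    [rng.choice(train_len, size=n_spikes, replace=False) for _ in range(n_rows)]
)
train_b = np.zeros((n_rows, train_len))
np.put_along_axis(train_b, unique, 1, axis=1)
print(train_b.sum(axis=1))  # [100. 100.]
```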

edited Oct 28, 2020 at 14:40

answered Oct 27, 2020 at 12:37

Andre


5 Comments

Potatoconomy Over a year ago

Thanks! All of the answers provided on this thread offered great solutions, but I think this is the easiest one to explain to beginning students.

Daniel F Over a year ago

@Potatoconomy Please don't teach beginning students not to use broadcasting. We spend so much time trying to beat that out of people in our answers here. Using repeat isn't easily extensible to higher dimensions, is slow and wasteful of memory, and ignores one of the fundamental benefits of numpy.

Daniel F Over a year ago

Also, even if you don't want to broadcast, using np.put_along_axis (see my edit) is better practice as well

Andre Over a year ago

@DanielF, Thanks for the comment, I didn't know that! I'll edit my answer.

Daniel F Over a year ago

@Andre Sorry for putting you on blast. I normally wouldn't do that but if they're going to be teaching it to others I'd rather make sure best practices are followed.

Valdi_Bo · Accepted Answer · 2020-10-27 11:07:48Z

1

One possibility to write the indicated elements without an explicit loop is:

spike_train[(np.repeat(np.arange(spikes.shape[0]), spikes.shape[1]), spikes.flatten())] = 1

But its execution time is even longer than that of your original loop.

Another option is to change the last instruction to:

spike_train[i, spikes[i]] = 1

(there is no need to explicitly pass the ":" for the second dimension). With this change, execution is a bit faster than with your original code.
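These timing claims are easy to check on your own machine. Here is a rough sketch using `timeit`; the function names `with_loop` and `with_repeat` and the seeded generator are illustrative, and the numbers will vary with hardware and NumPy version:

```python
import timeit

import numpy as np

rng = np.random.default_rng(3)  # seeded only to make the sketch reproducible
spike_train = np.zeros((2, 2000))
spikes = rng.integers(low=0, high=spike_train.shape[1], size=(2, 100))


def with_loop():
    # Explicit loop over rows, without the redundant ":" slice
    for i in range(spikes.shape[0]):
        spike_train[i, spikes[i]] = 1


def with_repeat():
    # Loop-free version that materializes a flat array of row numbers
    rows = np.repeat(np.arange(spikes.shape[0]), spikes.shape[1])
    spike_train[rows, spikes.flatten()] = 1


for fn in (with_loop, with_repeat):
    seconds = timeit.timeit(fn, number=1000)
    print(f"{fn.__name__}: {seconds:.4f} s for 1000 calls")
```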

answered Oct 27, 2020 at 11:07

Valdi_Bo


fountainhead · Accepted Answer · 2020-10-27 13:45:43Z

I'm not sure about the readability part, but if your intention is really to eliminate the for loop, you could try this (my real answer is in the penultimate line of the code - the rest of the code is just setting up the demo data or just printing stuff):

import numpy as np

TRAIN_LEN = 10                       # Should be 2000 in reality

# Initialize with sequential values, just to demonstrate that this answer
# works correctly, and updates with ONES in the right places
spike_train = np.arange(2*TRAIN_LEN).reshape(2, TRAIN_LEN)

print(spike_train)
SPIKES_LEN = 5                        # Should be 100 in reality
spikes = np.random.randint(low=0, high = spike_train.shape[1]-1, size = (2,SPIKES_LEN)) # Indexes to fill
print(spikes)

# Here's where we actually do the update
spike_train[np.arange(spikes.shape[0], dtype=np.intp).reshape(-1,1), spikes] = 1
print(spike_train)

This prints:

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]]
[[4 8 0 6 0]
 [8 8 4 3 3]]

and, after the update:

[[ 1  1  2  3  1  5  1  7  1  9]
 [10 11 12  1  1 15 16 17  1 19]]

Notice that the 2x10 array of sequential numbers is now updated, to have 1's in all the right places.
