
I am writing a program in Python and want to vectorize it as much as possible. I have the following variables:

  1. 2D array of zeros E with shape (L,T).
  2. array w with shape (N,) with arbitrary values.
  3. array index with shape (A,) whose values are integers between 0 and N-1. The values are unique.
  4. array labels with the same shape as index ((A,)), whose values are integers between 0 and L-1. The values are not necessarily unique.
  5. Integer t between 0 and T-1.

We want to add the values of w at indices index to the array E at rows labels and column t. I used the following code:

E[labels,t] += w[index]

But this approach does not give the desired result when labels contains repeated values. For example,

import numpy as np

E = np.zeros([10,1])
w = np.arange(0,100)
index = np.array([1,3,4,12,80])
labels = np.array([0,0,5,5,2])
t = 0
E[labels,t] += w[index]

Gives

array([[ 3.],
       [ 0.],
       [80.],
       [ 0.],
       [ 0.],
       [12.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.]])

But the correct answer would be

array([[ 4.],
       [ 0.],
       [80.],
       [ 0.],
       [ 0.],
       [16.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.]])

Is there a way to achieve this behavior without using a for loop?

I realized I can use this: np.add.at(E,[labels,t],w[index]) but it gives me this warning:

FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  • I'm sorry but what is indices?? I think you mean index..right?? Commented Mar 1, 2019 at 17:39
  • As the warning indicates use a tuple like np.add.at(E, (labels, t), w[index]) instead of passing a list. Commented Mar 1, 2019 at 19:33
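
To illustrate the suggestion in the comment above, here is a minimal sketch using the arrays from the question. np.add.at performs unbuffered in-place addition, so repeated entries in labels accumulate instead of overwriting each other, and passing the index as the tuple (labels, t) rather than a list should avoid the FutureWarning.

```python
import numpy as np

E = np.zeros([10, 1])
w = np.arange(0, 100)
index = np.array([1, 3, 4, 12, 80])
labels = np.array([0, 0, 5, 5, 2])
t = 0

# Unbuffered addition: every occurrence of a repeated label contributes.
# The index is a tuple (rows, column), not a list.
np.add.at(E, (labels, t), w[index])

# Row 0 now holds 1 + 3, row 5 holds 4 + 12, row 2 holds 80.
print(E)
```

np.add.at is usually slower than np.bincount for large inputs, so it is mainly worth using when the indexing pattern does not fit bincount.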

1 Answer

Pulled from a similar question, you can use np.bincount() to achieve your goal:

import numpy as np
import time

E = np.zeros([10,1])
w = np.arange(0,100)
index = np.array([1,3,4,12,80])
labels = np.array([0,0,5,5,2])
t = 0

# --------- Using np.bincount()
start = time.perf_counter()
for _ in range(10000):
    E = np.zeros([10,1])
    values = w[index]
    result = np.bincount(labels, values, E.shape[0])
    E[:, t] += result
print("Bin count time: {}".format(time.perf_counter() - start))
print(E)


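# --------- Using for loop
# Note: start is not reset here, so the time printed for this loop
# also includes the time spent in the bincount loop above.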
for _ in range(10000):
    E = np.zeros([10,1])
    for i, in_ in enumerate(index):
        E[labels[i], t] += w[in_]
print("For loop time: {}".format(time.perf_counter() - start))
print(E)

Gives:

Bin count time: 0.045003452
[[ 4.]
 [ 0.]
 [80.]
 [ 0.]
 [ 0.]
 [16.]
 [ 0.]
 [ 0.]
 [ 0.]
 [ 0.]]
For loop time: 0.09853353699999998
[[ 4.]
 [ 0.]
 [80.]
 [ 0.]
 [ 0.]
 [16.]
 [ 0.]
 [ 0.]
 [ 0.]
 [ 0.]]
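
For a comparison in which each approach is timed on its own, a sketch along the following lines with the standard-library timeit module may be more reliable. The helper names use_bincount and use_loop are hypothetical, introduced only for this illustration, and the timings will vary by machine.

```python
import timeit
import numpy as np

w = np.arange(0, 100)
index = np.array([1, 3, 4, 12, 80])
labels = np.array([0, 0, 5, 5, 2])
t = 0

def use_bincount():
    # Sum the weights per label, then add the sums to column t.
    E = np.zeros([10, 1])
    E[:, t] += np.bincount(labels, weights=w[index], minlength=E.shape[0])
    return E

def use_loop():
    # Reference implementation: one scalar addition per element.
    E = np.zeros([10, 1])
    for i, in_ in enumerate(index):
        E[labels[i], t] += w[in_]
    return E

# Each call to timeit.timeit measures only the function passed to it.
print("Bin count time:", timeit.timeit(use_bincount, number=10000))
print("For loop time:", timeit.timeit(use_loop, number=10000))
```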

2 Comments

np.bincount has a limitation, its input must be non-negative ints.
Agreed, but easily solvable by a quick one-liner to convert negative indices to positive. The weights don’t have to be non-negative ints.
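
One possible form of that one-liner, as a sketch: it assumes that a negative label is meant as a Python-style index counted from the end (so -5 refers to row L - 5), which the comments above do not state explicitly. The name positive_labels is hypothetical.

```python
import numpy as np

L = 10  # number of rows in E
w = np.arange(0, 100)
index = np.array([1, 3, 4, 12, 80])
labels = np.array([0, 0, -5, -5, 2])  # assumption: -5 means row L - 5

# The modulo maps negative labels into 0..L-1 and leaves valid ones unchanged.
positive_labels = labels % L

# The weights may be any numeric values; only the labels must be non-negative ints.
result = np.bincount(positive_labels, weights=w[index], minlength=L)
print(result)
```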
