I want to index a 2D array using a list of index arrays.

a = np.array([
    [1,2,3],
    [4,5,6]])

idx = [np.array([0,1]), np.array([0,2])]

What I want is for the first element of idx to give a[0,1], the second to give a[0,2], and so on, such that:

a[fixed_idx] = array([2,3])
  • list(np.array([0,1]),np.array([0,2]) is completely invalid syntax. What are you trying to express there? Commented Feb 22, 2022 at 15:39
  • a[0,[1,2]], a[( np.array([0,0]), np.array([1,2]) )] are equivalent ways of selecting those 2 elements. Commented Feb 22, 2022 at 17:14

2 Answers

IIUC, you could do:

a[tuple(zip(*idx))]

output: array([2, 3])
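
To see why this works, here is a minimal sketch that assumes idx is the list of two (row, column) arrays from the question. zip(*idx) regroups the pairs into one sequence of row indices and one of column indices, and indexing with a tuple of sequences is NumPy's advanced indexing. The names rows, cols, and result are only for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
idx = [np.array([0, 1]), np.array([0, 2])]  # (row, col) pairs, as in the question

# zip(*idx) turns [(0, 1), (0, 2)] into the rows (0, 0) and the columns (1, 2)
rows, cols = zip(*idx)

# A tuple of index sequences triggers advanced indexing: same as a[rows, cols]
result = a[tuple(zip(*idx))]
print(result)
```

This picks a[0,1] and a[0,2] in a single indexing call instead of looping over idx.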

3 Comments

That's the problem. OP apparently wants [1,2] as the output.
@richardec that might be an error, a[0,1], a[0,2] is (2,3)
Yes, I understand. Possibly the OP meant to use 0,0 and 0,1...?

Suppose you have more indices, like:

dummy_idx = [n for n in np.random.randint(100, size=(1000000, 2))]

Then you need to get advanced indices x and y such that a[x, y] gives what you expect.

There are two easy ways to do that:

  1. x, y = zip(*dummy_idx)
  2. x, y = np.transpose(dummy_idx)
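
As a small sanity check, the sketch below runs both options on three hand-picked pairs and confirms that they select the same elements. The 100x100 array big and the list pairs are hypothetical stand-ins for a and dummy_idx, chosen only so the example is self-contained.

```python
import numpy as np

big = np.arange(100 * 100).reshape(100, 100)  # hypothetical array to index into
pairs = [np.array([0, 1]), np.array([0, 2]), np.array([5, 7])]

x1, y1 = zip(*pairs)          # two tuples of NumPy integer scalars
x2, y2 = np.transpose(pairs)  # two 1-D arrays

via_zip = big[x1, y1]
via_transpose = big[x2, y2]
print(via_zip, via_transpose)
```

Both variants produce index sequences that NumPy accepts for advanced indexing; they differ only in how x and y are built.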

The first method is quite slow because NumPy arrays are not designed for fast iteration: accessing their items one at a time takes much longer than a vectorised NumPy operation. np.transpose, on the other hand, has to collect the many small arrays into a new array, which is even worse because every one of them must be copied into its place in the new array.

This is a red flag that you're working with data structures NumPy is not designed for: it is slow when you're handling lots of small arrays.

However, the two methods np.ndarray.tolist and np.ndarray.tobytes are somewhat better optimized for repeated use. You can take advantage of this and mimic the behaviour of np.transpose(dummy_idx) about 30% faster:

ls = []
for n in dummy_idx: 
    ls.extend(n.tolist())
x, y = np.fromiter(ls, dtype=dummy_idx[0].dtype).reshape(-1, 2).T

and

b = bytearray()
for n in dummy_idx: 
    b.extend(n.tobytes())
x, y = np.frombuffer(b, dtype=dummy_idx[0].dtype).reshape(-1, 2).T

Results

  • zip - 161 ms
  • np.transpose - 205 ms
  • np.fromiter - 117 ms
  • np.frombuffer - 117 ms
  • a single bare loop over dummy_idx (for comparison) - 16 ms
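
If you want to reproduce the comparison, something like the following could be used. It is only a sketch: the function names are made up for illustration, it uses 100,000 pairs instead of 1,000,000 so that it runs quickly, and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

# Fewer pairs than the 1,000,000 used above, to keep the sketch fast
dummy_idx = [n for n in np.random.randint(100, size=(100_000, 2))]

def via_zip():
    return tuple(zip(*dummy_idx))

def via_transpose():
    return tuple(np.transpose(dummy_idx))

def via_fromiter():
    ls = []
    for n in dummy_idx:
        ls.extend(n.tolist())
    return tuple(np.fromiter(ls, dtype=dummy_idx[0].dtype).reshape(-1, 2).T)

def via_frombuffer():
    b = bytearray()
    for n in dummy_idx:
        b.extend(n.tobytes())
    return tuple(np.frombuffer(b, dtype=dummy_idx[0].dtype).reshape(-1, 2).T)

# Best of three single runs per approach, reported in milliseconds
timings = {}
for f in (via_zip, via_transpose, via_fromiter, via_frombuffer):
    timings[f.__name__] = min(timeit.repeat(f, number=1, repeat=3))
    print(f"{f.__name__}: {timings[f.__name__] * 1e3:.1f} ms")
```

All four functions return the same x and y values, so whichever is fastest on your machine can be dropped in before a[x, y].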
