Indexing a 2d array using a list of arrays

Question

I want to index a 2d array based on a list of arrays.

a = np.array([
    [1,2,3],
    [4,5,6]])

idx = [np.array([0,1]), np.array([0,2])]

What I then want is for the first element of idx to give a[0,1], the second to give a[0,2], and so on, such that:

a[fixed_idx] = array([2,3])

list(np.array([0,1]),np.array([0,2]) is completely invalid syntax. What are you trying to express there?
– user17242583, Commented Feb 22, 2022 at 15:39
a[0,[1,2]] and a[( np.array([0,0]), np.array([1,2]) )] are equivalent ways of selecting those 2 elements.
– hpaulj, Commented Feb 22, 2022 at 17:14
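
To make the equivalence in the last comment concrete, here is a minimal sketch using the array from the question (the names first and second are only illustrative):

```python
import numpy as np

a = np.array([
    [1, 2, 3],
    [4, 5, 6]])

# Row 0 is fixed; the list selects columns 1 and 2 of that row.
first = a[0, [1, 2]]

# Explicit row and column index arrays, paired element by element.
second = a[(np.array([0, 0]), np.array([1, 2]))]

# Both forms select a[0, 1] and a[0, 2].
assert np.array_equal(first, second)
```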

mozway · Accepted Answer · 2022-02-22 15:39:14Z

IIUC, you could do:

a[tuple(zip(*idx))]

output: array([2, 3])
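
For readers who want to see why this works, here is a small self-contained sketch. It assumes idx is a list of (row, column) pairs as in the question; the names rows, cols and result are only illustrative.

```python
import numpy as np

a = np.array([
    [1, 2, 3],
    [4, 5, 6]])

# Each array in idx holds one (row, column) pair.
idx = [np.array([0, 1]), np.array([0, 2])]

# zip(*idx) regroups the pairs into all rows, then all columns.
rows, cols = zip(*idx)      # rows == (0, 0), cols == (1, 2)

# Indexing with a tuple of index sequences picks a[0, 1] and a[0, 2].
result = a[tuple(zip(*idx))]
print(result)               # [2 3]
```

The tuple() call matters: NumPy treats a tuple as one index per axis, so the first inner tuple addresses rows and the second addresses columns.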

user17242583 Over a year ago

That's the problem. OP apparently wants [1,2] as the output.

mozway Over a year ago

@richardec that might be an error, a[0,1], a[0,2] is (2,3)

user17242583 Over a year ago

Yes, I understand. Possibly the OP meant to use 0,0 and 0,1...?

mathfux · Accepted Answer · 2022-02-23 07:32:59Z

Suppose you have more indices, like:

dummy_idx = [n for n in np.random.randint(100, size=(1000000, 2))]

Then you need to get advanced indices x and y such that a[x, y] gives what you expect.

There are two easy ways to do that:

x, y = zip(*dummy_idx)
x, y = np.transpose(dummy_idx)
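
As a quick sanity check, both lines should produce the same advanced indices. The sketch below uses a much smaller list and a hypothetical 100x100 array a, neither of which is from the original post; the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)                     # arbitrary seed
a = rng.integers(1000, size=(100, 100))            # hypothetical array to index into
dummy_idx = list(rng.integers(100, size=(5, 2)))   # five (row, column) pairs

# Method 1: plain Python regrouping; x1 and y1 are tuples of NumPy scalars.
x1, y1 = zip(*dummy_idx)

# Method 2: stack the pairs into a (5, 2) array, then transpose to (2, 5).
x2, y2 = np.transpose(dummy_idx)

# Element-by-element lookup for comparison.
expected = [a[i, j] for i, j in dummy_idx]
picked = a[x2, y2]

assert np.array_equal(a[x1, y1], expected)
assert np.array_equal(picked, expected)
```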

The first method is quite slow because NumPy arrays are not designed for fast iteration, so accessing their items one by one takes a long time compared with vectorised NumPy operations. On the other hand, np.transpose first collects the many small arrays into a new array, which is even worse: each small array has to be copied into its place in the new array, and that is even more expensive.

This is a red flag that you're trying to work with data structures NumPy is not designed for. In particular, NumPy is slow when you're working with a large number of small arrays.

However, there are two methods, np.ndarray.tolist and np.ndarray.tobytes, that are somewhat better optimised for repeated use. You can take advantage of this and mimic the behaviour of np.transpose(dummy_idx) in a way that is about 30% faster:

ls = []
for n in dummy_idx: 
    ls.extend(n.tolist())
x, y = np.fromiter(ls, dtype=dummy_idx[0].dtype).reshape(-1, 2).T

and

b = bytearray()
for n in dummy_idx: 
    b.extend(n.tobytes())
x, y = np.frombuffer(b, dtype=dummy_idx[0].dtype).reshape(-1, 2).T

Results

zip - 161 ms
np.transpose - 205 ms
np.fromiter - 117 ms
np.frombuffer - 117 ms
a single loop over dummy_idx (for comparison) - 16 ms
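
A rough harness along these lines could be used to reproduce the comparison. This is only a sketch under assumptions: the function names are made up here, the list is shortened to 10,000 pairs so it runs quickly, and how the original timings were measured is not stated in the post, so absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)                        # arbitrary seed
dummy_idx = list(rng.integers(100, size=(10_000, 2))) # shortened for a quick run

def via_zip():
    x, y = zip(*dummy_idx)
    return x, y

def via_transpose():
    x, y = np.transpose(dummy_idx)
    return x, y

def via_fromiter():
    ls = []
    for n in dummy_idx:
        ls.extend(n.tolist())
    x, y = np.fromiter(ls, dtype=dummy_idx[0].dtype).reshape(-1, 2).T
    return x, y

def via_frombuffer():
    b = bytearray()
    for n in dummy_idx:
        b.extend(n.tobytes())
    x, y = np.frombuffer(b, dtype=dummy_idx[0].dtype).reshape(-1, 2).T
    return x, y

methods = [via_zip, via_transpose, via_fromiter, via_frombuffer]

# Best of three single runs per method, in seconds.
timings = {f.__name__: min(timeit.repeat(f, number=1, repeat=3)) for f in methods}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.2f} ms")
```

All four functions should return the same x and y, so it is worth checking that before comparing the timings.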
