1

Assume that I have a NumPy array A with n dimensions, which might be very large, and assume that I have k one-dimensional Boolean masks M1, ..., Mk

I would like to extract an n-dimensional array, B, from A which contains all the elements of A located at indices where the "outer-AND" of all the masks is True.

...but I would like to do this without first forming the (possibly very large) "outer-AND" of all the masks, and without having to extract the specified elements from each axis one axis at a time hence creating (possibly many) intermediate copies in the process.

The example below demonstrates the two ways of extracting the elements from A just described above:

from functools import reduce
import numpy as np


m = 100

for _ in range(m):
    n = np.random.randint(0, 10)
    k = np.random.randint(0, n + 1)

    A_shape = tuple(np.random.randint(0, 10, n))

    A = np.random.uniform(-1, 1, A_shape)
    M_lst = [np.random.randint(0, 2, dim).astype(bool) for dim in A_shape]

    # Creating shape of B:
    B_shape = tuple(map(np.count_nonzero, M_lst)) + A_shape[len(M_lst):]
    # size of B:
    B_size = np.prod(B_shape)

    # --- USING "OUTER-AND" OF ALL MASKS --- #
    # Creating "outer-AND" of all masks:
    M = reduce(np.bitwise_and, (np.expand_dims(M, tuple(np.r_[:i, i+1:n])) for i, M in enumerate(M_lst)), True)
    # extracting elements from A and reshaping to the correct shape:
    B1 = A[M].reshape(B_shape)
    # Checking that the correct number of elements was extracted
    assert B1.size == B_size
    # THE PROBLEM WITH THIS METHOD IS THE POSSIBLY VERY LARGE OUTER-AND OF ALL THE MASKS!

    # --- USING ONE MASK AT A TIME --- #
    B2 = A
    for i, M in enumerate(M_lst):
        B2 = B2[tuple(slice(None) for _ in range(i)) + (M,)]
    assert B2.size == np.prod(B_shape)
    assert B2.shape == B_shape
    # THE PROBLEM WITH THIS METHOD IS THE POSSIBLY LARGE NUMBER OF POSSIBLY LARGE INTERMEDIATE COPIES!

    assert np.all(B1 == B2)

    # EDIT 1:
    # USING np.ix_ AS SUGGESTED BY Chrysophylaxs
    i = np.ix_(*M_lst)
    B3 = A[i]
    assert B3.shape == B_shape
    assert B3.size == B_size
    assert np.prod(list(map(np.size, i))) == B_size

print(f'All three methods worked all {m} times')

Is there a smarter (more efficient) way to do this, possibly using an existing NumPy function?

1 Answer 1

5

IIUC, you're looking for np.ix_; an example:

import numpy as np

arr = np.arange(60).reshape(3, 4, 5)

x = [True, False, True]
y = [False, True, True, False]
z = [False, True, False, True, False]

out = arr[np.ix_(x, y, z)]

out:

array([[[ 6,  8],
        [11, 13]],

       [[46, 48],
        [51, 53]]])
Sign up to request clarification or add additional context in comments.

1 Comment

Yep this seems to work, thx a lot!, I added it to the list of methods in my loop...

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.