2

I have an array "removable" containing a few numbers from another array "All", which contains all integers from 0 to k - 1.

I want to remove from All every number that is listed in removable.

import numpy as np

All = np.arange(k)
removable = np.array([1, 3, 4, 7, 9, ..., 200])

for i in removable:
    if i in All:
        All.remove(i)  # fails: ndarray has no remove()

This fails because ndarray has no remove attribute. I'm sure NumPy has an easy way to do this, but I can't find it in the documentation.

2
  • Why not use lists? Commented Feb 5, 2019 at 14:50
  • I get removable from another method; sadly, I'm not able to change it. Commented Feb 5, 2019 at 14:51

5 Answers

5

You could use the function setdiff1d from NumPy:

>>> a = np.array([1, 2, 3, 2, 4, 1])
>>> b = np.array([3, 4, 5, 6])
>>> np.setdiff1d(a, b)
array([1, 2])
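
Applied to the setup in the question, this might look like the following sketch. The value k = 10 and the contents of removable are made up for illustration, since the question elides the real values.

```python
import numpy as np

k = 10  # assumed value for illustration; the question does not give k
all_numbers = np.arange(k)             # 0, 1, ..., k - 1
removable = np.array([1, 3, 4, 7, 9])  # assumed sample of values to drop

# setdiff1d returns the sorted, unique values of the first array
# that do not appear in the second one.
remaining = np.setdiff1d(all_numbers, removable)
print(remaining)  # [0 2 5 6 8]
```

Because np.arange(k) is already sorted and free of duplicates, the sorting and de-duplication done by setdiff1d do not change anything here.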

4 Comments

Note that this will de-duplicate the original entries in a (which the pseudocode in the question does not do), and the result will be sorted.
That's true; however, np.arange(k) produces an array without duplicates. My answer will not work if duplicates must be kept.
Oh snap, setdiff1d is even faster than explicit set conversion and differencing. I guess that makes sense, since it is probably more optimized. I didn't know NumPy had this!
Now the question is whether the OP wants to keep duplicates or drop them.
2

np.setdiff1d() will de-duplicate the original entries, and will also return the result sorted.

That's fine in some cases, but if you want to avoid one or both of these aspects, have a look at np.in1d() with an (inverted) boolean mask:

>>> a = np.array([1, 2, 3, 2, 4, 1])
>>> b = np.array([3, 4, 5, 6])
>>> a[~np.in1d(a, b)]
array([1, 2, 2, 1])

The ~ operator does inversion on the boolean mask:

>>> np.in1d(a, b)
array([False, False,  True, False,  True, False])

>>> ~np.in1d(a, b)
array([ True,  True, False,  True, False,  True])
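
The same filter can also be written with np.isin, which the NumPy documentation recommends over np.in1d for new code. A minimal sketch using its invert parameter, so that no separate ~ is needed (the names keep and filtered are only illustrative):

```python
import numpy as np

a = np.array([1, 2, 3, 2, 4, 1])
b = np.array([3, 4, 5, 6])

# invert=True marks the elements of a that are NOT in b
keep = np.isin(a, b, invert=True)
filtered = a[keep]  # order and duplicates of a are preserved
print(filtered)  # [1 2 2 1]
```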

Disclaimer:

Note that this is not truly removal in the sense of your question: boolean-mask indexing returns a new array (a copy) holding the filtered elements, and the original array a is left unchanged. The same goes for np.delete(); there is no concept of in-place element deletion for NumPy arrays.
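
A quick way to see how the filtered result relates to the original array is np.shares_memory. The sketch below reuses the sample arrays a and b; the names filtered and deleted are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 2, 4, 1])
b = np.array([3, 4, 5, 6])

filtered = a[~np.isin(a, b)]    # boolean-mask indexing
deleted = np.delete(a, [2, 4])  # drop the elements at positions 2 and 4

# Neither result shares memory with a, and a still has all six elements.
print(np.shares_memory(a, filtered))  # False
print(np.shares_memory(a, deleted))   # False
print(a.size)                         # 6

# To "remove" elements, rebind the name to the new array.
a = filtered
print(a)  # [1 2 2 1]
```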


1

A solution that is fast for big arrays and avoids converting to a list (which would slow down the computation):

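import numpy as np

orig = np.arange(15)
to_remove = np.array([1, 2, 3, 4])
mask = np.isin(orig, to_remove)  # True where an element should be removed
orig = orig[np.invert(mask)]     # keep only the elements not in to_remove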

>>> orig
array([ 0,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

3 Comments

np.isin() calls np.asarray() + np.in1d() and does reshaping. If both inputs are 1-D, those extra steps are probably not needed.
There's no need to say it's impossible or to give a different solution from the one the OP asked for.
Constructive criticism is always welcome. We are all fellow programmers here, some having learnt a lot already, others just starting. Help correct mistakes or don't, but there is absolutely no need to mock them.
-1

NumPy arrays have a fixed size, so you cannot remove elements from them in place.

You cannot do this with ndarrays.

1 Comment

Upvote to counter the downvotes. Perhaps pedantic, but not incorrect. And not an unimportant aspect about ndarrays to appreciate, on the path to a good solution to this problem.
-1

You should do this with sets instead of lists/arrays, which is easy enough:

remaining = np.array(list(set(arr).difference(removable)))

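where arr is your All array above ("all" is a built-in function and should not be overwritten). The set is converted to a list first because np.array() applied directly to a set produces a 0-d object array rather than a 1-D array of numbers.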

Granted, using sets will get rid of repeated elements if arr has any, but it sounds like arr is just a sequence of unique values. Sets have much more efficient membership checking (constant time on average vs. O(N)), so this runs a lot faster. For comparison, I wrote a list version that appends each value to a result list if it is not in removable:

def remove_list(arr, rem):
    result = []
    for i in arr:
        if i not in rem:
            result.append(i)
    return result

and made my set version a function as well:

def remove_set(arr, rem):
    return np.array(list(set(arr).difference(rem)))

Timing comparison with arr = np.arange(10000) and removable = np.random.randint(0, 10000, 1000):

remove_list(arr, removable)
# 55.5 ms ± 664 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

remove_set(arr, removable)
# 947 µs ± 3.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

The set version is nearly 60 times faster (55.5 ms vs. 947 µs).
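
For anyone who wants to repeat the measurement, here is a rough sketch using the standard timeit module, with np.setdiff1d and an np.isin mask added for comparison. The set version wraps the set in list() so that np.array() builds a 1-D array. The helper names remove_setdiff and remove_isin and the repeat count are my own choices, and timings will vary by machine and NumPy version.

```python
import timeit

import numpy as np


def remove_list(arr, rem):
    # Linear scan of rem for every element of arr
    return [i for i in arr if i not in rem]


def remove_set(arr, rem):
    # list() is needed: np.array(a_set) would give a 0-d object array
    return np.array(list(set(arr).difference(rem)))


def remove_setdiff(arr, rem):  # hypothetical helper name
    return np.setdiff1d(arr, rem)


def remove_isin(arr, rem):  # hypothetical helper name
    return arr[~np.isin(arr, rem)]


arr = np.arange(10000)
removable = np.random.randint(0, 10000, 1000)

repeats = 5  # arbitrary; raise it for more stable numbers
for func in (remove_list, remove_set, remove_setdiff, remove_isin):
    seconds = timeit.timeit(lambda: func(arr, removable), number=repeats)
    print(f"{func.__name__}: {seconds / repeats * 1e3:.3f} ms per call")
```

Note that the set version does not guarantee any particular order of the result, whereas the isin mask keeps the original order of arr.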
