
Suppose I have a numpy array from which I want to remove a specific element.

import numpy as np

data = np.array([97, 32, 98, 32, 99, 32, 100, 32, 101])
# collect the indices where the element is located
indices = np.where(data == 32)
without_32 = np.delete(data, indices)
# without_32 becomes [ 97  98  99 100 101]

Now suppose I want to restore the array (I already have the indices where the value 32 should go).

restore_data = np.insert(without_32, indices[0], 32)

But it gives IndexError: index 10 is out of bounds for axis 0 with size 9. Is there another way to implement this?

Update

It seems that after deleting the elements I need to adjust the indices, like this:

restore_data = np.insert(without_32, indices[0]-np.arange(len(indices[0])), 32)

But can I generalize this? That is, track not only 32 but also 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47. I want to handle every value in 32-47 the same way, efficiently.

  • np.insert(without_32, indices[0]-np.arange(len(indices[0])), 32), IIUC Commented May 6, 2022 at 13:47
  • It works @MichaelSzczesny. Could you explain what indices[0]-np.arange(len(indices[0])) does here? Thanks Commented May 6, 2022 at 13:54
  • This adjusts the indices to the correct insertion points for np.insert. I'm hesitant to answer this question because it is redundant: you must already have the original array. This looks like an XY problem. Commented May 6, 2022 at 14:32
  • I suspect there are better solutions for the actual use case. As a general rule, try to avoid np.delete and np.insert. Commented May 6, 2022 at 14:41
  • I see. Can I generalize this? That is, track not only 32 but also 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47. I want to handle 32-47 the same way. Can I do this in a better way? And thanks for your response. @MichaelSzczesny Commented May 9, 2022 at 14:48
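
To illustrate the index adjustment discussed in the comments above, here is a minimal sketch using the question's own array. The variable names `positions` and `insert_at` are introduced here for illustration only. It relies on `np.insert` interpreting each index relative to the array before any insertion takes place, which is why the original positions have to be shifted.

```python
import numpy as np

data = np.array([97, 32, 98, 32, 99, 32, 100, 32, 101])

# Positions of 32 in the ORIGINAL array: [1 3 5 7]
positions = np.where(data == 32)[0]
without_32 = np.delete(data, positions)  # [ 97  98  99 100 101]

# np.insert expects indices into the shortened array. Each original
# position is shifted left by the number of removed elements that came
# before it (0, 1, 2, 3, ...), giving [1 2 3 4].
insert_at = positions - np.arange(len(positions))

restored = np.insert(without_32, insert_at, 32)
print(restored)
```

Passing the unadjusted `positions` instead would ask for index 7 in a 5-element array, which is the kind of out-of-bounds request behind the IndexError in the question.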

1 Answer


My alternative:

# define a mask
mask = data == 32
mask  # array([False,  True, False,  True, False,  True, False,  True, False])

# filter
without_32 = data[~mask]
without_32  # array([ 97,  98,  99, 100, 101])

Then if you want to have the original data:

restore_data = np.ones_like(mask, dtype=int)*32
restore_data[~mask] = without_32
restore_data

output:

array([ 97,  32,  98,  32,  99,  32, 100,  32, 101])

In practice, you create a constant array of 32s with the same length as mask (and therefore as data), and then fill the positions where mask is False (the positions where data != 32) with the values from without_32.
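
The same idea can be wrapped in a small helper. This is only a sketch: `restore_constant` is a hypothetical name, and it swaps `np.ones_like(...) * 32` for `np.full`, which builds the constant array directly and lets the result reuse the dtype of the kept values.

```python
import numpy as np

def restore_constant(kept, mask, fill_value):
    """Return an array with fill_value where mask is True and the
    kept values, in order, where mask is False."""
    out = np.full(mask.shape, fill_value, dtype=kept.dtype)
    out[~mask] = kept
    return out

data = np.array([97, 32, 98, 32, 99, 32, 100, 32, 101])
mask = data == 32
without_32 = data[~mask]

restored = restore_constant(without_32, mask, 32)
print(restored)
```

This only works when every removed element has the same value; for several different removed values the removed elements themselves have to be stored as well.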

UPDATE

To answer your update:

data = np.random.randint(20, 60, size=20)
#array([47, 39, 29, 45, 21, 44, 48, 27, 21, 25, 47, 59, 58, 53, 46, 36, 34, 57, 36, 54])

mask = (data >= 32) & (data <= 47)  # True for the values you want to remove

clean_array = data[~mask]  # data you want to retain
removed_data = data[mask]  # data you want to remove

Now you can del data and do whatever you want with clean_array. When you need to reconstruct the original array, you just do:

restore_data = np.zeros_like(mask, dtype=int)
restore_data[~mask] = clean_array
restore_data[mask] = removed_data
#array([47, 39, 29, 45, 21, 44, 48, 27, 21, 25, 47, 59, 58, 53, 46, 36, 34, 57, 36, 54])
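
If the values to track are not a contiguous range, the range comparison can be replaced with `np.isin`. The sketch below assumes that situation; `split_by_values` and `restore` are hypothetical helper names, not part of NumPy.

```python
import numpy as np

def split_by_values(data, values):
    """Split data into (kept, removed, mask), where mask marks the
    elements of data that appear in values."""
    mask = np.isin(data, values)
    return data[~mask], data[mask], mask

def restore(kept, removed, mask):
    """Inverse of split_by_values: put both parts back in place."""
    out = np.empty(mask.shape, dtype=kept.dtype)
    out[~mask] = kept
    out[mask] = removed
    return out

rng = np.random.default_rng(0)
data = rng.integers(20, 60, size=20)

# Track every value from 32 to 47 inclusive.
kept, removed, mask = split_by_values(data, np.arange(32, 48))

# ... intermediate processing on `kept` would go here ...

restored = restore(kept, removed, mask)
```

Boolean indexing preserves order, so `kept` and `removed` each come back in their original relative order and the round trip reproduces `data` exactly, as long as the processing step does not change the length of `kept`.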

8 Comments

Which one is better: storing the mask, or storing the indices as MichaelSzczesny said in the comment? I will run this code on a huge array (thousands of thousands of elements long). @SalvatoreDanieleBianco
It depends. The bool type is more "convenient" than the int type, but the array of indices is shorter than the mask. If you expect to find very few 32s, use the indices; if you expect to find a lot of 32s, use the mask. You can check the memory usage of an array this way: stackoverflow.com/questions/11784329/… . In your example the indices array is more convenient than the mask.
I see. Can I generalize this? That is, track not only 32 but also 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47. I want to handle 32-47 the same way. Can I do this in a better way? And thanks for your response. I will accept the answer. @SalvatoreDanieleBianco
Compression won't be an issue here. Rather, removing those values for some intermediate processing is what matters. It would be a great help if you could share some efficient code. Thanks again @SalvatoreDanieleBianco
Thanks @SalvatoreDanieleBianco. Your updated solution will get my job done.
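
On the mask-versus-indices question raised in the comments above, the two footprints can be measured directly with the `nbytes` attribute. The array below is synthetic and its size and value range are assumptions chosen only for illustration. A boolean mask costs one byte per element of the original array, while an index array costs one integer per match (8 bytes on a typical 64-bit build, though this is platform dependent).

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.integers(0, 1000, size=1_000_000)  # synthetic test data

mask = (data >= 32) & (data <= 47)
positions = np.flatnonzero(mask)  # integer indices of the matches

print("mask bytes:     ", mask.nbytes)
print("positions bytes:", positions.nbytes)
print("bytes per index:", positions.itemsize)
```

With 8-byte indices the break-even point is roughly one match per eight elements: sparser matches favour the index array, denser matches favour the mask.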