105

I need a function that returns non-NaN values from an array. Currently I am doing it this way:

>>> a = np.array([np.nan, 1, 2])
>>> a
array([ NaN,   1.,   2.])

>>> np.invert(np.isnan(a))
array([False,  True,  True], dtype=bool)

>>> a[np.invert(np.isnan(a))]
array([ 1.,  2.])

Python: 2.6.4, numpy: 1.3.0

Please share if you know a better way. Thank you.

4 Answers

204
a = a[~np.isnan(a)]
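
For reference, here is a minimal self-contained sketch of the one-liner above, using the array from the question. The names `mask`, `same_mask` and `filtered` are only illustrative. On a boolean array, `~` is the operator form of `np.invert`, so the mask is the same one the question builds.

```python
import numpy as np

a = np.array([np.nan, 1, 2])

# Boolean mask: True where the element is not NaN.
mask = ~np.isnan(a)

# On boolean arrays, ~ and np.invert produce the same mask.
same_mask = np.array_equal(mask, np.invert(np.isnan(a)))

# Boolean indexing returns a new array; `a` itself only changes
# if the result is assigned back to it, as in the one-liner.
filtered = a[mask]
print(filtered)
```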

75

You are currently testing for anything that is not NaN and mtrw has the right way to do this. If you are interested in testing for finite numbers (is not NaN and is not INF) then you don't need an inversion and can use:

np.isfinite(a)

This is more Pythonic and native, and an easy read. In my experience, when you want to avoid NaN you often also want to avoid INF.

Just thought I'd toss that out there for folks.
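
To make the difference concrete, here is a small sketch comparing the two masks. The sample array `b` is made up for illustration and is not from the question. Note that `np.isfinite` returns a boolean mask, so it still has to be used as an index to filter the array.

```python
import numpy as np

# Illustrative data containing NaN and both infinities.
b = np.array([np.nan, 1.0, np.inf, 2.0, -np.inf])

# Drops NaN only; inf and -inf are kept.
not_nan = b[~np.isnan(b)]

# Drops NaN, inf and -inf in one step, with no inversion needed.
finite = b[np.isfinite(b)]

print(not_nan)
print(finite)
```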

7 Comments

Note: If you want to use isnotnan for filtering pandas, this is the way to go.
@EzekielKruglick if the data is already in pandas, not only is pandas actually faster, but it is more functional as well, given that it includes an index you can use to more easily join on: gist.github.com/jaypeedevlin/fdfb88f6fd1031a819f1d46cb36384da
I think leave it in the comments - the original question is not about pandas.
@JoshD. that's incorrect, Numpy is faster. I commented on your Gist: gist.github.com/jaypeedevlin/… . Basically, you did it wrong -- you're performing the operation on the Pandas object, rather than doing it on the ndarray. Performing the operation on the ndarray is about 25x faster.
@philipKahn Hmm, looks like I did make an error. I was imagining that numpy would cast to an ndarray before it did the operations, so that .values was unnecessary - live and learn!
4

To get array([ 1., 2.]) from the array arr = np.array([np.nan, 1, 2]), you can do:

arr[~np.isnan(arr)]

OR

arr[arr == arr]

(The second form works because np.nan == np.nan is False.)
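
A short sketch of why the self-comparison works, assuming a float array like the one in this answer. The variable names are only for illustration. NaN compares unequal to everything, including itself, so `arr == arr` is False exactly where `np.isnan(arr)` is True.

```python
import numpy as np

arr = np.array([np.nan, 1, 2])

# False at the NaN position, True everywhere else.
self_equal = arr == arr

# The explicit form of the same mask.
via_isnan = ~np.isnan(arr)

# For this float array the two masks are identical.
masks_match = np.array_equal(self_equal, via_isnan)

result = arr[self_equal]
print(result)
```

The `np.isnan` form is arguably the easier one to read, since it states the intent directly. The self-comparison relies on the reader knowing how NaN behaves under `==`.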


2

I'm not sure whether this is more or less pythonic...

a = [i for i in a if not np.isnan(i)]

1 Comment

It's not appropriate for numpy arrays. Not only do you now get a list back (and thus fundamentally change the nature of the object returned) but this runs in a Python loop and will be orders of magnitude slower than a numpy method. I do not recommend this at all
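
A small sketch of the points in this comment, using the array from the question. The variable names are illustrative. It also shows a related caveat: an identity test against `np.nan` (`is` / `is not`) does not detect NaN when iterating over an array, because iteration yields new `np.float64` objects rather than the `np.nan` object itself.

```python
import numpy as np

a = np.array([np.nan, 1, 2])

# Value test in a list comprehension: removes NaN, but runs in a
# Python loop and returns a plain list instead of an ndarray.
as_list = [x for x in a if not np.isnan(x)]

# Identity test: no element is the np.nan object itself,
# so nothing is filtered out.
identity_kept = [x for x in a if x is not np.nan]

# Vectorized mask: the result is still an ndarray.
as_array = a[~np.isnan(a)]

print(type(as_list).__name__, len(identity_kept), type(as_array).__name__)
```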
