"isnotnan" functionality in numpy, can this be more pythonic?

I need a function that returns non-NaN values from an array. Currently I am doing it this way:

>>> a = np.array([np.nan, 1, 2])
>>> a
array([ NaN,   1.,   2.])

>>> np.invert(np.isnan(a))
array([False,  True,  True], dtype=bool)

>>> a[np.invert(np.isnan(a))]
array([ 1.,  2.])

Python: 2.6.4 numpy: 1.3.0

Please share if you know a better way. Thank you.

a = a[~np.isnan(a)]
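For completeness, a minimal sketch applying this to the array from the question; ~ is just the operator shorthand for np.invert on the boolean mask:

import numpy as np

a = np.array([np.nan, 1, 2])
mask = ~np.isnan(a)   # same mask as np.invert(np.isnan(a)): array([False,  True,  True])
a = a[mask]           # array([ 1.,  2.])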

You are currently testing for anything that is not NaN, and mtrw's answer is the right way to do that. If you are instead interested in testing for finite numbers (not NaN and not inf), then you don't need an inversion at all and can use:

np.isfinite(a)

It is more pythonic and native, and an easy read; in my experience, when you want to avoid NaN you often want to avoid inf as well.
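A minimal sketch of the difference, assuming an array that also contains an infinity:

import numpy as np

a = np.array([np.nan, 1, 2, np.inf])

a[~np.isnan(a)]      # drops NaN only:         array([  1.,   2.,  inf])
a[np.isfinite(a)]    # drops NaN and +/- inf:  array([ 1.,  2.])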

Just thought I'd toss that out there for folks.

I'm not sure whether this is more or less pythonic...

a = [i for i in a if not np.isnan(i)]

(An identity test such as "i is not np.nan" does not work here: the scalars pulled out of the array are separate objects from np.nan, so the identity check never matches and nothing gets filtered.)
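For comparison, a quick sketch of what each approach returns; the comprehension yields a plain Python list of scalars, while the boolean mask keeps everything as an ndarray (and is generally much faster on large arrays):

import numpy as np

a = np.array([np.nan, 1, 2])

[i for i in a if not np.isnan(i)]   # plain Python list of scalars
a[~np.isnan(a)]                     # ndarray: array([ 1.,  2.])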

Comments
  • Note: if you want isnotnan-style filtering for pandas data, this is the way to go.
  • @EzekielKruglick if the data is already in pandas, not only is pandas actually faster, but it is more functional as well, given that it includes an index you can use to more easily join on: gist.github.com/jaypeedevlin/fdfb88f6fd1031a819f1d46cb36384da
  • I think leave it in the comments - the original question is not about pandas.
  • @JoshD. that's incorrect, Numpy is faster. I commented on your Gist: gist.github.com/jaypeedevlin/… . Basically, you did it wrong -- you're performing the operation on the Pandas object, rather than doing it on the ndarray. Performing the operation on the ndarray is about 25x faster.
  • @philipKahn Hmm, looks like I did make an error. I was imagining that numpy would cast to an ndarray before it did the operations, so that .values was unnecessary - live and learn!
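A rough sketch of the two variants discussed in the comments above, assuming the data lives in a pandas Series; the exact speed difference will depend on the pandas/NumPy versions and the data size:

import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1, 2])

s.dropna()                     # pandas-level: returns a Series and preserves the index
s.values[~np.isnan(s.values)]  # ndarray-level: filters the underlying NumPy array directly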