I have a two-dimensional array of data. I need to average every two rows and return the averages in an array half the height of the original, ignoring all NaN values when averaging. For example:

>>> import numpy
>>> from numpy import nan
>>> x = numpy.array([[ 1,  nan,  3,  4,  5],
... [ 6,  7,  8,  9, nan],
... [11, 12, 13, 14, nan],
... [16, nan, 18, 19, nan]])

And the function would need to return:

>>> x
array([[3.5,  7,  5.5,  6.5,  5],
[13.5, 12, 15.5, 16.5, nan]])
  • numpy has masked arrays, and I'd think you can specify np.nan as the mask, then apply the averaging operation. Commented Sep 11, 2012 at 4:26
  • +1: question is kind of localized, but at least it's clear and concise with expected input and output. Commented Sep 11, 2012 at 5:00

1 Answer

This should do the trick:

numpy.ma.average(numpy.ma.masked_invalid(x).reshape(-1, 2, x.shape[-1]), 1)

For me it returns

masked_array(data =
 [[3.5 7.0 5.5 6.5 5.0]
 [13.5 12.0 15.5 16.5 --]],
             mask =
 [[False False False False False]
 [False False False False  True]],
       fill_value = 1e+20)
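
The one-liner above can be unpacked into a self-contained sketch. The function name average_row_pairs and the even-row check are illustrative assumptions, not part of the original answer; the NumPy calls are the same ones the answer uses.

```python
import numpy as np

def average_row_pairs(x):
    """Average each consecutive pair of rows, ignoring NaN values.

    Hypothetical helper name; it wraps the masked-array approach above.
    """
    x = np.asarray(x, dtype=float)
    if x.shape[0] % 2 != 0:
        # Assumption: an odd row count is an error rather than being padded.
        raise ValueError("expected an even number of rows")
    # Mask NaN (and inf) entries so they are excluded from the average.
    masked = np.ma.masked_invalid(x)
    # Group the rows in pairs: shape (n_rows // 2, 2, n_cols).
    pairs = masked.reshape(-1, 2, x.shape[-1])
    # Average within each pair; a column that is NaN in both rows stays masked.
    return np.ma.average(pairs, axis=1)

x = np.array([[1, np.nan, 3, 4, 5],
              [6, 7, 8, 9, np.nan],
              [11, 12, 13, 14, np.nan],
              [16, np.nan, 18, 19, np.nan]])
result = average_row_pairs(x)
print(result)
```

The result is still a masked array, so the last entry of the second row prints as masked rather than as nan.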

6 Comments

Props to yosukesabai for suggesting masked_array
This is exactly what I was looking for, thanks so much. Just started learning Python this summer, still a long way to go!
Hmm, I am getting array([[ 3.5, nan, 5.5, 6.5, nan], [ 13.5, nan, 15.5, 16.5, nan]]), and I'm not sure what's wrong.
Added my output to the answer. I don't know why you would be getting different output. I'm using Python 2.6 with numpy 1.3.0.
numpy.savetxt doesn't seem to work on masked arrays. Convert back to a normal array with explicit NaN values with x.filled(numpy.NaN), then pass it to numpy.savetxt.
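
The conversion described in the last comment might look like the sketch below. The sample values, the in-memory buffer, and the fmt string are assumptions made for illustration, and the fill value is spelled numpy.nan (lowercase).

```python
import io
import numpy as np

# A small masked array standing in for the averaged result.
masked = np.ma.masked_invalid(np.array([[3.5, 7.0], [13.5, np.nan]]))

# Replace masked entries with explicit NaN to get a plain ndarray.
plain = masked.filled(np.nan)

# Write to an in-memory buffer here; a filename works the same way.
buffer = io.StringIO()
np.savetxt(buffer, plain, fmt="%.1f")
text = buffer.getvalue()
print(text)
```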