Calculating Average for specific column in a 2D array

Question

I am new to Python and need your help. I need to calculate the average of a specific column in a very large array. I would like to use the numpy.average function (I am open to other suggestions) but cannot figure out how to select a column by its header (e.g. the average of the Flavor_Score column):

  Beer_name    Tester    Flavor_Score    Overall_Score

  Coors        Jim       2.0             3.0
  Sam Adams    Dave      4.0             4.5
  Becks        Jim       3.5             3.5
  Coors        Dave      2.0             2.2
  Becks        Dave      3.5             3.7

Do I have to transpose the array to calculate the average of a column? It seems there are many functions for rows in pandas and numpy but relatively few for columns (I could be wrong, of course).

Second question for the same array: is building on the answer to the first question (calculating the average Flavor_Score) the best way to calculate the average Flavor_Score for a specific beer (across different testers)?

beer_test = "Coors"

for name in beer_names:  # beer_names: the values of the Beer_name column
    if name == beer_test:
        pass  # running average calculation would go here

I would hope there is a built-in function for this.

Very much appreciate your help!

To calculate the average for a specific column, use df['Flavor_Score'].mean(); for a specific beer: df[df['Beer_Name'] == 'Coors', 'Flavor_Score'].mean()
– EdChum, Commented Sep 4, 2015 at 19:57
@EdChum - The first line of code worked perfectly! Does the mean function calculate an average or a mean? The second line of code produced an error: it did not like df[df['Beer_Name']..., so I replaced it with df[df.beer_name == "Coors"].Flavor_Score.mean() and it worked! Thank you, EdChum!!
– Toly, Commented Sep 4, 2015 at 21:19
As an extension of my question: what is the best way to print the average Flavor_Score for all unique beers in the list instead of one chosen beer? I created an array of unique beer names using pd.unique(df.beer_name.ravel()) and then converted the array to a list with Beer_Name_L=Beer_Name_Arr.tolist() (not sure if I needed to do that :)). Then I tried to print using for i in xrange(len(Beer_Name_L)): beername=Beer_Name_L(i) print df[df.beer_name==beername].Flavor_Score.mean() but got the error message "TypeError: 'list' object is not callable".
– Toly, Commented Sep 4, 2015 at 21:42
@Toly Could you please check the answer to close the question? Thanks in advance.
– Romain, Commented Sep 15, 2015 at 21:05

Romain · Accepted Answer · 2015-09-05 06:13:43Z

OK, here is an example of how to do that.

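import pandas as pd

# Test data
df = pd.DataFrame({'Beer_name': ['Coors', 'Sam Adams', 'Becks', 'Coors', 'Becks'],
                   'Tester': ['Jim', 'Dave', 'Jim', 'Dave', 'Dave'],
                   'Flavor_Score': [2, 4, 3.5, 2, 3.5],
                   'Overall_Score': [3, 4.5, 3.5, 2.2, 3.7]})
# Mean of every numeric column; numeric_only=True skips the text columns,
# which recent pandas versions otherwise refuse to average
df.mean(numeric_only=True)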

Flavor_Score     3.00
Overall_Score    3.38
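
For a single column, as the question asks, a minimal sketch (assuming the same test data as above) selects the column by its header first. Without weights, numpy.average computes the same arithmetic mean as Series.mean(), so "average" and "mean" agree here. The variable names flavor_mean and flavor_avg are only illustrative.

```python
import numpy as np
import pandas as pd

# Same test data as in the answer above
df = pd.DataFrame({'Beer_name': ['Coors', 'Sam Adams', 'Becks', 'Coors', 'Becks'],
                   'Tester': ['Jim', 'Dave', 'Jim', 'Dave', 'Dave'],
                   'Flavor_Score': [2, 4, 3.5, 2, 3.5],
                   'Overall_Score': [3, 4.5, 3.5, 2.2, 3.7]})

# Select the column by its header, then average it; no transpose is needed
flavor_mean = df['Flavor_Score'].mean()

# The same value via NumPy, computed on the column's underlying array
flavor_avg = np.average(df['Flavor_Score'].to_numpy())

print(flavor_mean, flavor_avg)
```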

Then you can use the groupby feature to get the mean per beer:

df.groupby('Beer_name').mean(numeric_only=True)

           Flavor_Score  Overall_Score
Beer_name                             
Becks               3.5            3.6
Coors               2.0            2.6
Sam Adams           4.0            4.5
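
If only Flavor_Score is of interest, one possible sketch (again assuming the test data from the answer) selects that column after grouping, and uses .loc with a boolean mask to pick out a single beer. Iterating over the grouped result prints one line per beer without building a separate list of unique names. The names per_beer and coors_mean are hypothetical.

```python
import pandas as pd

# Same test data as in the answer above
df = pd.DataFrame({'Beer_name': ['Coors', 'Sam Adams', 'Becks', 'Coors', 'Becks'],
                   'Tester': ['Jim', 'Dave', 'Jim', 'Dave', 'Dave'],
                   'Flavor_Score': [2, 4, 3.5, 2, 3.5],
                   'Overall_Score': [3, 4.5, 3.5, 2.2, 3.7]})

# Average Flavor_Score for every beer: pick the column after grouping
per_beer = df.groupby('Beer_name')['Flavor_Score'].mean()

# Average Flavor_Score for one beer: row mask and column label go through .loc
coors_mean = df.loc[df['Beer_name'] == 'Coors', 'Flavor_Score'].mean()

# One line per unique beer, taken straight from the grouped result
for beer_name, score in per_beer.items():
    print(beer_name, score)
```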

You can even see how it looks by tester:

df.groupby(['Beer_name','Tester']).mean()

                  Flavor_Score  Overall_Score
Beer_name Tester                             
Becks     Dave             3.5            3.7
          Jim              3.5            3.5
Coors     Dave             2.0            2.2
          Jim              2.0            3.0
Sam Adams Dave             4.0            4.5
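
To compare testers side by side, the grouped result can be pivoted with unstack so that each tester becomes a column. This is only a sketch on the same test data, and by_tester is an illustrative name. A beer that a tester did not score (here Sam Adams for Jim) shows up as NaN.

```python
import pandas as pd

# Same test data as in the answer above
df = pd.DataFrame({'Beer_name': ['Coors', 'Sam Adams', 'Becks', 'Coors', 'Becks'],
                   'Tester': ['Jim', 'Dave', 'Jim', 'Dave', 'Dave'],
                   'Flavor_Score': [2, 4, 3.5, 2, 3.5],
                   'Overall_Score': [3, 4.5, 3.5, 2.2, 3.7]})

# Mean Flavor_Score per (beer, tester), then move the Tester level into columns
by_tester = (df.groupby(['Beer_name', 'Tester'])['Flavor_Score']
               .mean()
               .unstack('Tester'))

print(by_tester)
```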

Good beer!

edited Sep 5, 2015 at 6:13

answered Sep 4, 2015 at 21:03

Romain


2 Comments

Toly Over a year ago

Perfect!! Easy and elegant! Thank you!

Romain Over a year ago

Thanks, and if my answer fulfills your needs, do not forget to check it.
