New column in Pandas dataframe based on value of variable in existing column

Question

I am having difficulty creating a new column whose value is based on the value of an existing column in the same dataframe. The existing column is numeric, and I'm trying to give the new column a categorical value of high, medium, or low based on something like:

low: < (max-min)/3

med: between (max-min)/3 and (max-min)/3 *2

high: > (max-min)/3 *2

Still learning Pandas, so any help is appreciated. Thanks!

EDIT:

This is what I have attempted:

df_unit_day_hour['Level_Score'] = pd.cut(df_unit_day_hour['Level_Score'], q=3, labels=['low', 'medium', 'high'])

I think it's almost what I need, but I'm getting an error (KeyError). Would it be because df_unit_day_hour['Level_Score'] is a float?

Please post raw input data, code to reproduce your df and the desired output, thanks.
– EdChum, Commented Jun 2, 2015 at 12:32

firelynx · Accepted Answer · 2015-06-02 14:55:14Z

It sounds like you are trying to recreate what the pandas.cut function already does.

Consider the example below:

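import numpy as np
import pandas as pd

# Ten random integers in the range 0-9
df = pd.DataFrame({'val': np.random.choice(10, 10)})
# Bin edges (-1, 2], (2, 5], (5, 10] map to the three labels
df['cat'] = pd.cut(df['val'], [-1, 2, 5, 10], labels=['low', 'medium', 'high'])
df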

   val   cat
0    6  high
1    2   low
2    7  high
3    7  high
4    8  high
5    8  high
6    9  high
7    6  high
8    2   low
9    0   low
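
The example above uses hand-picked bin edges. As a rough sketch of the two automatic alternatives, assuming the goal is either thirds of the value range (as in the question) or thirds of the rows: passing an integer as bins to pd.cut gives equal-width bins, while pd.qcut with q=3 gives bins with roughly equal row counts. Note that q is a parameter of pd.qcut, not of pd.cut. The sample values and the column names by_range and by_quantile are made up for illustration.

```python
import pandas as pd

# Hypothetical, deliberately skewed sample data (not from the question)
df = pd.DataFrame({'val': [1, 2, 3, 4, 5, 6, 7, 8, 100]})

labels = ['low', 'medium', 'high']

# Equal-width bins: an integer `bins` splits the range [min, max]
# into that many intervals of equal width.
df['by_range'] = pd.cut(df['val'], bins=3, labels=labels)

# Equal-frequency bins: `q=3` splits at the 1/3 and 2/3 quantiles,
# so each label gets roughly the same number of rows.
df['by_quantile'] = pd.qcut(df['val'], q=3, labels=labels)

print(df)
```

With this skewed sample, by_range puts the eight small values in low and only 100 in high (medium stays empty), while by_quantile assigns three rows to each label. Which one fits depends on whether the levels should reflect the value range or the distribution of the rows.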

edited Jun 2, 2015 at 14:55

answered Jun 2, 2015 at 12:37


3 Comments

user1624577 Over a year ago

Thanks for your response. That seemed to put me on the right track, but I'm getting a KeyError. I updated my post to show what I attempted. Thanks again.

firelynx Over a year ago

@user1624577, I updated my example to explain better how to use the cut/qcut functions.

user1624577 Over a year ago

Very much appreciated! I could have done this in SAS in a couple of minutes, but I'm trying to break away from that platform. Thanks again!
