I would like to plot multiple time series (one for each value in the column cat) in one plot, but I haven't worked out how to do that. My code so far is:

import numpy as np
import pandas as pd

dat = pd.date_range(start='2018-01-01', end='2018-12-31', freq='h')
num = ['A' + str(x).zfill(4) for x in range(len(dat))]
cat = np.random.choice(['A', 'B', 'C', 'D'], len(dat))

df = pd.DataFrame({'date': dat, 'num': num, 'cat':cat}).set_index('date')

print(df.groupby([pd.Grouper(freq='D'), 'cat']).count().unstack().fillna(0).astype(int))

Result:

           num            
cat          A   B   C   D
date                      
2018-01-01   7   3   5   9
2018-01-02   6   3   6   9
2018-01-03  11   3   8   2
2018-01-04   2   6   5  11
2018-01-05   4   8   4   8
2018-01-06   8   8   3   5
2018-01-07   5   8   6   5
2018-01-08   3   8   5   8

I would like to plot different combinations of categories (cat), such as A and B together or C and D together, in one time series plot with matplotlib or seaborn, but I am stuck on the MultiIndex columns.

Any suggestions on how to select different combinations of columns and plot them? Maybe there is a better way than unstacking the data.

If you're using pandas 0.24+, you can chain on .droplevel(0, axis=1) to get rid of the redundant column level. (Commented Jul 24, 2019 at 12:03)
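To make the comment's suggestion concrete, here is a minimal sketch that rebuilds the frame from the question and drops the outer 'num' column level. The fixed seed and the names counts and flat are additions for illustration and are not part of the original post.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame; the seed is an addition for repeatability.
np.random.seed(0)
dat = pd.date_range(start="2018-01-01", end="2018-12-31", freq="h")
num = ["A" + str(x).zfill(4) for x in range(len(dat))]
cat = np.random.choice(["A", "B", "C", "D"], len(dat))
df = pd.DataFrame({"date": dat, "num": num, "cat": cat}).set_index("date")

# The question's unstacked result has two column levels: 'num' on top, 'cat' below.
counts = (
    df.groupby([pd.Grouper(freq="D"), "cat"])
    .count()
    .unstack()
    .fillna(0)
    .astype(int)
)

# Drop the outer 'num' level so the columns are simply the category labels.
flat = counts.droplevel(0, axis=1)
print(flat.columns.tolist())
```

After this, a selection such as flat[['A', 'B']] works without referring to the 'num' level.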

1 Answer

Yes, it is better to avoid a MultiIndex in the columns:

df1 = df.groupby([pd.Grouper(freq='D'), 'cat'])['num'].count().unstack(fill_value=0)

Or:

df1 = df.groupby([pd.Grouper(freq='D'), 'cat']).size().unstack(fill_value=0)

Then plot:

df1[['A','B']].plot()
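For the "A and B together or C and D together" case from the question, one possible approach is to loop over column groups and draw each on its own matplotlib axis. This is only a sketch: the combos list, the seed, the figure size, and the axis label are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Rebuild the data from the question; the seed is an addition for repeatability.
np.random.seed(0)
dat = pd.date_range(start="2018-01-01", end="2018-12-31", freq="h")
cat = np.random.choice(["A", "B", "C", "D"], len(dat))
df = pd.DataFrame({"date": dat, "cat": cat}).set_index("date")

# Daily counts per category, with one flat column per category.
df1 = df.groupby([pd.Grouper(freq="D"), "cat"]).size().unstack(fill_value=0)

# Hypothetical groupings: each inner list becomes one subplot.
combos = [["A", "B"], ["C", "D"]]
fig, axes = plt.subplots(len(combos), 1, sharex=True, figsize=(10, 6))
for ax, cols in zip(axes, combos):
    df1[cols].plot(ax=ax)  # one line per selected category
    ax.set_ylabel("count per day")
fig.tight_layout()
```

Call plt.show() to display the figure, or fig.savefig(...) to write it to a file. To draw all four series in a single axis instead, df1.plot() is enough.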