I would like to create a Series in pandas from a DataFrame that I have.
The DataFrame has 3 columns: 'date', 'time' and 'frequ'. I would like the first two columns ('date' and 'time') to form the index of the new Series.
Unfortunately, my data contains missing values, so when I try to convert it to a Series I don't know how to specify the index. Normally, if there were no missing values, I would pass
index = pd.date_range(start = df.date[0], end = '2015/03/06 17:07:05', freq = 'S')
to the pd.Series() function.
But if I do that here, I get an error because the generated index is longer than my data (it has an entry for every second, including the seconds that are missing from my data).
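To make the length mismatch concrete, here is a minimal sketch that reproduces it. The four readings are a hypothetical stand-in for the sample data, and the lowercase 's' alias is used because recent pandas versions prefer it over 'S'.

```python
import pandas as pd

# Hypothetical stand-in for the sample data: four readings, 17:06:28 is missing.
values = [50.091, 50.087, 50.084, 50.083]

# A regular one-second range contains every second, including the missing one.
full_index = pd.date_range(start='2015/03/06 17:06:26',
                           end='2015/03/06 17:06:30', freq='s')

print(len(values), len(full_index))

# Pairing 4 values with 5 index entries is rejected by pandas.
mismatch_raised = False
try:
    pd.Series(values, index=full_index)
except ValueError as exc:
    mismatch_raised = True
    print('ValueError:', exc)
```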
So here is a small sample of my DataFrame:
Out[2]:
date time frequ
0 2015/03/06 17:06:26 50.091
1 2015/03/06 17:06:27 50.087
2 2015/03/06 17:06:29 50.084
3 2015/03/06 17:06:30 50.083
4 2015/03/06 17:06:31 50.082
.. ... ... ...
33 2015/03/06 17:07:03 50.079
34 2015/03/06 17:07:04 50.078
35 2015/03/06 17:07:05 50.077
(As can be seen, the row for 2015/03/06 17:06:28 is missing.)
This is roughly how the Series (ts) should look:
2015/03/06 17:06:26 50.091
2015/03/06 17:06:27 50.087
2015/03/06 17:06:29 50.084
2015/03/06 17:06:30 50.083
2015/03/06 17:06:31 50.082
... ... ...
2015/03/06 17:07:03 50.079
2015/03/06 17:07:04 50.078
2015/03/06 17:07:05 50.077
Again, in this output the date and time together form the index,
so if I call, for example:
In[3]: ts['2015/03/06 17:06:26': '2015/03/06 17:06:29']
I'll get:
Out[3]:
2015/03/06 17:06:26 50.091
2015/03/06 17:06:27 50.087
2015/03/06 17:06:29 50.084
Freq: S, dtype: float64
Finally, here is the code that I wrote:
import pandas as pd
data = {'frequ': sum_freq, 'time': sum_time, 'date': date_list}
df = pd.DataFrame(data, columns = ['date', 'time', 'frequ'])
ts = pd.Series(df.frequ.values, index = ???)
Does anybody have an idea how to overcome this problem?
Thanks!!!
(I use Python 2.7.6)
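
One possible direction, sketched under assumptions rather than offered as a definitive answer: build the index from the 'date' and 'time' columns themselves instead of from a regular range, so that its length always matches the data. The sketch assumes both columns hold strings in the formats shown in the sample; the small DataFrame is a hypothetical stand-in for the real one.

```python
import pandas as pd

# Hypothetical sample mirroring the DataFrame in the question (17:06:28 is missing).
df = pd.DataFrame({
    'date': ['2015/03/06'] * 4,
    'time': ['17:06:26', '17:06:27', '17:06:29', '17:06:30'],
    'frequ': [50.091, 50.087, 50.084, 50.083],
})

# Join the two string columns and parse the result into timestamps.
stamps = pd.to_datetime(df['date'] + ' ' + df['time'],
                        format='%Y/%m/%d %H:%M:%S')

# One timestamp per row, so the index length always equals the data length.
ts = pd.Series(df['frequ'].values, index=pd.DatetimeIndex(stamps))

# Label slicing on a sorted DatetimeIndex includes both endpoints.
subset = ts.loc['2015/03/06 17:06:26':'2015/03/06 17:06:29']
print(subset)

# Optional: reindex to a regular one-second grid; the missing second becomes NaN.
regular = ts.asfreq('s')
print(regular)
```

The Series built this way has no fixed frequency, because the gaps are preserved. If a regular one-second grid is needed afterwards, the asfreq call at the end makes the missing seconds explicit as NaN instead of raising an error.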