Python: Fill in missing datetime values in dataframe and fill forward?

Question

Let's say I have a dataframe as:

|       timestamp     | value |
| ------------------- | ----- |
| 01/01/2013 00:00:00 |  2.1  |
| 01/01/2013 00:00:03 |  3.7  |
| 01/01/2013 00:00:05 |  2.4  |

I'd like to have the dataframe as:

|       timestamp     | value |
| ------------------- | ----- |
| 01/01/2013 00:00:00 |  2.1  |
| 01/01/2013 00:00:01 |  2.1  |
| 01/01/2013 00:00:02 |  2.1  |
| 01/01/2013 00:00:03 |  3.7  |
| 01/01/2013 00:00:04 |  3.7  |
| 01/01/2013 00:00:05 |  2.4  |

How do I go about this?

jezrael · Accepted Answer · 2017-05-04 08:52:05Z


You can use resample with ffill:

print (df.dtypes)
timestamp     object
value        float64
dtype: object

df['timestamp'] = pd.to_datetime(df['timestamp'])

print (df.dtypes)
timestamp    datetime64[ns]
value               float64
dtype: object

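# resample to one-second steps and forward fill; timestamp becomes the index
# ('s' is the one-second alias; older pandas versions also accept 'S')
df = df.set_index('timestamp').resample('s').ffill()
print (df)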
                     value
timestamp                 
2013-01-01 00:00:00    2.1
2013-01-01 00:00:01    2.1
2013-01-01 00:00:02    2.1
2013-01-01 00:00:03    3.7
2013-01-01 00:00:04    3.7
2013-01-01 00:00:05    2.4

# or, starting again from the original df, add reset_index() to keep timestamp as a column
df = df.set_index('timestamp').resample('s').ffill().reset_index()
print (df)
            timestamp  value
0 2013-01-01 00:00:00    2.1
1 2013-01-01 00:00:01    2.1
2 2013-01-01 00:00:02    2.1
3 2013-01-01 00:00:03    3.7
4 2013-01-01 00:00:04    3.7
5 2013-01-01 00:00:05    2.4
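
For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question. The explicit `format` string is an assumption on my part (month/day/year); the question does not say which order the dates are in, and for 01/01 it makes no difference.

```python
import pandas as pd

# Rebuild the question's sample data; timestamps start out as plain strings.
df = pd.DataFrame({
    "timestamp": ["01/01/2013 00:00:00", "01/01/2013 00:00:03", "01/01/2013 00:00:05"],
    "value": [2.1, 3.7, 2.4],
})

# Assumed format (month/day/year); passing it avoids day/month guessing.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%m/%d/%Y %H:%M:%S")

# Upsample to one row per second and carry the last known value forward,
# then turn the index back into a regular column.
filled = df.set_index("timestamp").resample("s").ffill().reset_index()
print(filled)
```

The result has one row per second from 00:00:00 to 00:00:05, with each gap filled from the previous observed value.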

edited May 4, 2017 at 8:52

answered May 4, 2017 at 8:42

jezrael


6 Comments

aswa09 Over a year ago

Could you tell me why you used 'pd.to_datetime()'? Isn't timestamp already in datetime format?

jezrael Over a year ago

Because resample works only with datetimes, and 01/01/2013 00:00:00 is not a datetime, only the string representation of one.

aswa09 Over a year ago

but once you 'resample', timestamp would become the index right? So I'd have to copy the df.index.values to a list, make it a column, and then reindex?

ra67052 Over a year ago

Can I just say I've spent the best part of a whole day on stack and other websites and this is the first solution that has worked, thank you so much :)

Minura Punchihewa Over a year ago

Could someone please tell me why I am getting the following error? "cannot reindex a non-unique index with a method or limit"
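
One common cause of that error is duplicate timestamps: forward filling during an upsample needs a unique index. The sketch below assumes that is what is happening here; `df_dup` and its values are made up for illustration, and keeping the last row per timestamp is just one possible choice (averaging the duplicates would be another).

```python
import pandas as pd

# Hypothetical data with a repeated timestamp (00:00:03 appears twice).
df_dup = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2013-01-01 00:00:00",
        "2013-01-01 00:00:03",
        "2013-01-01 00:00:03",
        "2013-01-01 00:00:05",
    ]),
    "value": [2.1, 3.7, 3.9, 2.4],
})

# Keep only the last row for each timestamp so the index will be unique.
deduped = df_dup.drop_duplicates(subset="timestamp", keep="last")

# Then upsample to one-second steps and forward fill as in the answer.
filled_dup = deduped.set_index("timestamp").resample("s").ffill()
print(filled_dup)
```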


CreekGeek · Accepted Answer · 2021-09-04 05:11:14Z

Note: if your index is already a datetime index...

...then following the answer above as written will throw an error, because set_index('timestamp') looks for a column that no longer exists. You could convert the index back to a column and use @jezrael's answer, or calculate a new index with pd.date_range.

Consider df_test with 5 minute data and missing rows:

import pandas as pd

# create new datetime index based on specified range
# ("5min" is the 5-minute alias; older pandas versions also accept "5T")
daterng_all = pd.date_range(start='2021-08-17 15:00:00', end='2021-08-17 16:30:00', freq="5min")

# insert rows for the missing intervals (as NaN), then fill them by interpolation
df_test = df_test.reindex(daterng_all).interpolate()

Above, I've chained interpolate() to fill the missing data values, but you could also use .ffill() as in @jezrael's answer. interpolate() has more kwargs, and it works well for my particular data (environmental time series). I particularly like the 'limit' kwarg, which lets me cap how many consecutive missing values get filled, so long gaps that don't make sense to fill that way are not fully filled.
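
To make the comparison concrete, here is a small sketch of the reindex approach with both fill strategies. The contents of `df_test` are invented for illustration (the real data is not shown), and `limit=2` is an arbitrary choice.

```python
import pandas as pd

# Hypothetical 5-minute series with rows missing between 15:05 and 15:30.
idx = pd.to_datetime(["2021-08-17 15:00", "2021-08-17 15:05", "2021-08-17 15:30"])
df_test = pd.DataFrame({"value": [1.0, 2.0, 7.0]}, index=idx)

# Full 5-minute grid covering the same span.
daterng_all = pd.date_range(start=idx[0], end=idx[-1], freq="5min")

# Reindexing inserts NaN rows for the missing intervals.
expanded = df_test.reindex(daterng_all)

# Linear interpolation, filling at most 2 consecutive missing rows per gap.
interpolated = expanded.interpolate(limit=2)

# Forward fill, for comparison with the resample/ffill answer.
forward_filled = expanded.ffill()

print(interpolated)
print(forward_filled)
```

Note that `limit` caps the number of consecutive fills rather than skipping a long gap entirely: in this sketch the first two missing rows of the four-row gap are interpolated and the remaining two stay NaN.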
