
I'm creating the dataframe df_stats and I want to fill it with one row of values for each t in t_list. When I run the loop, df_stats doesn't populate with values, but if I run the line df_stats.append({... independently, it populates one row of data with the values of the current t. What am I missing to populate df_stats with a row of data from each t in t_list?

import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
import glob

# add all data files into one large df so all dates are accessible
path = r'C:\Users\data'  # use your path; raw string so the backslashes are not read as escapes
all_files = glob.glob(path + "/*.csv")
li = []
for filename in all_files:
    df_data = pd.read_csv(filename, index_col=None, header=0)
    li.append(df_data)

df_data = pd.concat(li, axis=0, ignore_index=True)
df_data['datetime'] = pd.to_datetime(df_data['TimeStamp'] )
df = df_data[(df_data['datetime']>= datetime(2017, 11,9, 00,00, 00)) &
         (df_data['datetime']< datetime(2017, 11, 9, 23,50, 00))]

##want a time array for all of the datetimes in the df
t_list = df.groupby("datetime").all().index

df_stats = pd.DataFrame(columns = ['t', 'min_ws', 'max_ws', 'mean_ws','stdev_ws',
 'TI_var_ws', 'min_power', 'max_power', 'mean_power', 'stdev_pwr', 'TI_var_pwr'])

for t in t_list:
    df_t = df[(df['datetime']>=t) & (df['datetime']<t_end)]

    #calc min/max for setting scale on images
    t = t
    min_ws = df['wtc_AcWindSp_mean'].min()
    max_ws = df['wtc_AcWindSp_mean'].max()
    mean_ws = df['wtc_AcWindSp_mean'].mean()
    stdev_ws = df['wtc_AcWindSp_mean'].std()
    TI_var_ws = stdev_ws/mean_ws

    min_power = df['wtc_ActPower_mean'].min()
    max_power = df['wtc_ActPower_mean'].max()
    mean_power = df_t['wtc_ActPower_mean'].mean()
    stdev_pwr = df_t['wtc_ActPower_mean'].std()
    TI_var_pwr = stdev_pwr/mean_power

    df_stats.append({'t':t, 'min_ws':min_ws, 'max_ws':max_ws, 'mean_ws':mean_ws,'stdev_ws':stdev_ws,
    'TI_var_ws':TI_var_ws, 'min_power':min_power, 'max_power':max_power, 'mean_power': mean_power,
    'stdev_pwr':stdev_pwr, 'TI_var_pwr':TI_var_pwr}, ignore_index=True)

1 Answer


You need to reassign the DataFrame, because append does not modify it in place; it always returns a new object:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.append.html

df_stats = df_stats.append({'t':t, 'min_ws':min_ws, 'max_ws':max_ws, 'mean_ws':mean_ws,'stdev_ws':stdev_ws,
    'TI_var_ws':TI_var_ws, 'min_power':min_power, 'max_power':max_power, 'mean_power': mean_power,
    'stdev_pwr':stdev_pwr, 'TI_var_pwr':TI_var_pwr}, ignore_index=True)
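
Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the reassignment above only works on older versions. A minimal sketch of an alternative that does not depend on append: collect one dict per t in a plain list and build the DataFrame once after the loop. The data below is made up, only a few of the columns are computed, and groupby is used to obtain df_t for each t; that last part is an assumption, since the original loop filters on a t_end that is not defined in the question.

```python
import pandas as pd

# Hypothetical stand-in data; the real df is read from the CSV files.
df = pd.DataFrame({
    "datetime": pd.to_datetime(["2017-11-09 00:00", "2017-11-09 00:00",
                                "2017-11-09 00:10", "2017-11-09 00:10"]),
    "wtc_AcWindSp_mean": [4.0, 6.0, 8.0, 10.0],
})

rows = []  # one dict per timestamp
for t, df_t in df.groupby("datetime"):
    ws = df_t["wtc_AcWindSp_mean"]
    rows.append({"t": t, "min_ws": ws.min(), "max_ws": ws.max(), "mean_ws": ws.mean()})

# Build the frame once, after the loop, instead of growing it row by row.
df_stats = pd.DataFrame(rows, columns=["t", "min_ws", "max_ws", "mean_ws"])
print(df_stats)
```

Appending to a Python list is cheap, whereas each DataFrame append (or concat) inside a loop copies all rows collected so far.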

That said, you are likely better off preallocating the DataFrame with its index up front and filling in one row per iteration, something like:

# Pass an index argument
df_stats = pd.DataFrame(index=range(len(t_list)), columns = ['t', 'min_ws', 'max_ws', 'mean_ws','stdev_ws',
 'TI_var_ws', 'min_power', 'max_power', 'mean_power', 'stdev_pwr', 'TI_var_pwr'])

# ...

for i, t in enumerate(t_list):

    # ...

    # assign into df_stats (not df), one preallocated row per t
    df_stats.iloc[i] = {'t':t, 'min_ws':min_ws, 'max_ws':max_ws, 'mean_ws':mean_ws,'stdev_ws':stdev_ws,
    'TI_var_ws':TI_var_ws, 'min_power':min_power, 'max_power':max_power, 'mean_power': mean_power,
    'stdev_pwr':stdev_pwr, 'TI_var_pwr':TI_var_pwr}
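
To make the preallocation idea concrete, here is a minimal sketch with made-up values. The timestamps and the numbers written into each row are placeholders, only three columns are used, and the row is written as a list in column order, which is one way (not necessarily the only one) to fill a row with iloc.

```python
import pandas as pd

columns = ["t", "min_ws", "max_ws"]
# Hypothetical timestamps standing in for the real t_list.
t_list = pd.to_datetime(["2017-11-09 00:00", "2017-11-09 00:10"])

# Preallocate one empty row per timestamp.
df_stats = pd.DataFrame(index=range(len(t_list)), columns=columns)

for i, t in enumerate(t_list):
    # Placeholder values; the real loop would compute these from df_t.
    row = {"t": t, "min_ws": 1.0 + i, "max_ws": 5.0 + i}
    # Write the values positionally, in the same order as the columns.
    df_stats.iloc[i] = [row[c] for c in columns]

print(df_stats)
```

Columns in a frame preallocated this way have object dtype, so a conversion step may be needed before doing numeric work on them.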

1 Comment

Thank you for your helpful answer, I chose the iloc method and forgot to accept the answer, apologies!
