Appending pandas dataframes in for loop

Question

The problem with the code below is that the new DataFrame is not appended to df: when I print the shape, it is still (1, 6). How can I fix it?

columns = ['name', 'precision', 'recall', 'gmean', 'f1', 'mse']
df_SMOTE = pd.DataFrame(columns=columns )
df_ENN = pd.DataFrame(columns=columns )
df_Ensemble = pd.DataFrame(columns=columns )

for name, model in zip(names, [rfc, knc, lr, svc, dtc, xgbc, cbc, lgbc]):
    for X, y, df in [(X_smote, y_smote, df_SMOTE), (X_enn, y_enn, df_ENN), (X_smote, y_smote, df_Ensemble)]:
        learner = Learner(model, X, y)
        learner()
        precision, recall, gmean, f1, mse = learner.get_metrics()
        df = pd.concat([df, pd.DataFrame({'name': [name], 'precision': [precision], 'recall': [recall], 'gmean': [gmean], 'f1': [f1], 'mse': [mse]})], ignore_index=True)
        
        print(df.shape)

boxblox · Accepted Answer · 2023-01-26 01:09:30Z

It would be faster to collect the individual DataFrames in a list and call pd.concat once after the loop. Something like this:

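import numpy as np
import pandas as pd

columns = ["name", "precision", "recall", "gmean", "f1", "mse"]
dfs = []
for _ in range(100):
    # Build one small DataFrame per iteration and keep it in a list.
    dfs.append(
        pd.DataFrame(
            data=[
                (f"n{j}", f"p{j}", f"r{j}", f"g{j}", f"f{j}", f"m{j}")
                for j in np.random.randint(0, 10000, size=(5,))
            ],
            columns=columns,
        )
    )
# Concatenate once, after the loop has finished.
df = pd.concat(dfs, ignore_index=True)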
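
Applied to the loop in the question, the same idea might look like the sketch below. The reason the shape never grows there is that `df = pd.concat(...)` only rebinds the loop variable `df`; it never touches `df_SMOTE`, `df_ENN`, or `df_Ensemble`. Here `fake_metrics` is a hypothetical stand-in for `Learner(model, X, y)` and `learner.get_metrics()`, which are not shown in the question, and the model and dataset names are placeholders.

```python
import pandas as pd

columns = ["name", "precision", "recall", "gmean", "f1", "mse"]


def fake_metrics(model_name, dataset_name):
    """Hypothetical stand-in for the question's Learner; returns 5 numbers."""
    seed = len(model_name) + len(dataset_name)
    return tuple(round(seed / (10 + k), 3) for k in range(5))


model_names = ["rfc", "knc", "lr"]            # placeholder model names
dataset_names = ["SMOTE", "ENN", "Ensemble"]  # one result table per dataset

# Collect plain row dicts per dataset instead of rebinding a DataFrame name.
rows = {ds: [] for ds in dataset_names}

for model_name in model_names:
    for ds in dataset_names:
        precision, recall, gmean, f1, mse = fake_metrics(model_name, ds)
        rows[ds].append(
            {
                "name": model_name,
                "precision": precision,
                "recall": recall,
                "gmean": gmean,
                "f1": f1,
                "mse": mse,
            }
        )

# Build each DataFrame once, after the loops.
results = {ds: pd.DataFrame(r, columns=columns) for ds, r in rows.items()}
df_SMOTE = results["SMOTE"]
print(df_SMOTE.shape)  # (3, 6): one row per model
```

Keeping the rows in a dict keyed by dataset name avoids relying on a loop variable to update three separate DataFrames, and each table is built only once.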
