Value not updated in for loop Python

Question

I am testing the following simple example (see the comments in the code below for background). I have two questions. Thanks.

1. Why is b in bottle not updated, even though the for loop calculates the right value?
2. Is there an easier way to do this without a for loop? I have heard that loops can take a long time to run when the data is bigger than this simple example.

import pandas as pd

test = pd.DataFrame(
    [[1, 5],
     [1, 8],
     [1, 9],
     [2, 1],
     [3, 1],
     [4, 1]],
    columns=['a', 'b']
) # Original df
   
bottle = pd.DataFrame().reindex_like(test) # a blank df with the same shape
bottle['a'] = test['a'] # set 'a' in bottle to be the same in test
print(bottle)
   a   b
0  1 NaN
1  1 NaN
2  1 NaN
3  2 NaN
4  3 NaN
5  4 NaN

for index, row in bottle.iterrows():
    row['b'] = test[test['a'] == row['a']]['b'].sum()
    print(row['a'], row['b'])

1.0 22.0
1.0 22.0
1.0 22.0
2.0 1.0
3.0 1.0
4.0 1.0 # I can see for loop is doing what I need.
   
bottle
   a   b
0  1 NaN
1  1 NaN
2  1 NaN
3  2 NaN
4  3 NaN
5  4 NaN # However, 'b' in bottle is not updated by the for loop. Why? And how to fix that?

test['c'] = bottle['b'] # This is the end output I want to get, but not working due to the above. Also is there a way to achieve this without using for loop?

Chrysophylaxs · Accepted Answer · 2022-12-29 23:24:54Z

3

When you iterate over the dataframe's rows with iterrows, your row variable is a copy of the current row, not a view into bottle. Assigning to row["b"] changes only that copy, which is discarded when the loop moves on to the next iteration. If you want your for loop to work, you should assign to bottle.loc[index, "b"] instead of to row["b"].
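The difference can be checked with a minimal sketch like the one below. It rebuilds bottle with a plain dict constructor rather than reindex_like (an assumption made only to keep the example short), and the names unchanged and the placeholder value 0.0 are illustrative.

```python
import pandas as pd

test = pd.DataFrame({"a": [1, 1, 1, 2, 3, 4], "b": [5, 8, 9, 1, 1, 1]})
# Same layout as the question's bottle: 'a' copied from test, 'b' all NaN.
bottle = pd.DataFrame({"a": test["a"], "b": float("nan")})

# Writing to `row` only changes the temporary Series yielded by iterrows().
for index, row in bottle.iterrows():
    row["b"] = 0.0
unchanged = bottle["b"].isna().all()  # bottle still holds only NaN in 'b'

# Writing through .loc targets the DataFrame itself.
for index, row in bottle.iterrows():
    bottle.loc[index, "b"] = test.loc[test["a"] == row["a"], "b"].sum()

print(unchanged)
print(bottle["b"].tolist())
```

The first print should show that bottle was left untouched by the first loop, and the second should show the per-group sums written by the second loop.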

You can complete your task without a for loop by using pandas.DataFrame.groupby and transform as follows:

bottle["b"] = test.groupby("a")["b"].transform("sum")

bottle:

   a   b
0  1  22
1  1  22
2  1  22
3  2   1
4  3   1
5  4   1
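Since the stated end goal is a new column test['c'], the intermediate bottle frame may not be needed at all. A short sketch, assuming the same test data as in the question:

```python
import pandas as pd

test = pd.DataFrame({"a": [1, 1, 1, 2, 3, 4], "b": [5, 8, 9, 1, 1, 1]})

# transform("sum") returns one value per original row, aligned to test's index,
# so the result can be assigned directly as a new column.
test["c"] = test.groupby("a")["b"].transform("sum")
print(test)
```

Because transform keeps the original index, this works without sorting or merging, and it avoids the per-row filtering that the loop performs.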

Mickael Rodrigues Campos · Accepted Answer · 2022-12-30 00:06:33Z

1

The value of b in bottle is not updated because the loop never assigns to bottle itself. The row variable is a separate copy of the current row, so row['b'] = ... changes only that copy and leaves bottle untouched.

To fix this, you can modify the code as follows:

for index, row in bottle.iterrows():
    bottle.loc[index, 'b'] = test[test['a'] == row['a']]['b'].sum()

This will update the value of b in the bottle DataFrame for the current row in the loop.
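The loop above filters test once per row, which can get slow on larger data. One possible alternative, sketched here under the assumption that bottle only needs the per-group totals, is to compute each total once and look it up by the value of a. The variable name sums is illustrative.

```python
import pandas as pd

test = pd.DataFrame({"a": [1, 1, 1, 2, 3, 4], "b": [5, 8, 9, 1, 1, 1]})
bottle = pd.DataFrame({"a": test["a"]})

# Compute each group's total once; the result is a Series indexed by 'a'.
sums = test.groupby("a")["b"].sum()

# Look up the total for every row of bottle by its 'a' value.
bottle["b"] = bottle["a"].map(sums)
print(bottle)
```

Unlike assigning a column computed on test, map matches on the values of a, so it should also work when bottle has a different index or row order than test.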
