4
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, [4, 5], ["apple", "pear"]]})
df.replace({[4, 5]: 4.5})  # TypeError: unhashable type: 'list'
df.replace({["apple", "pear"]: "apple"})

This raises a TypeError. I want to replace specific lists, and there is no general rule relating a list to be replaced to the object that replaces it.

3 Answers

4

This is not a trivial problem, because DataFrames are not designed to work with mutable objects like lists, sets, or dicts.
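As a small illustration of why the dict in the question fails before pandas is even involved: dictionary keys must be hashable, and lists are not. The helper name is_hashable below is made up for this sketch.

```python
def is_hashable(obj):
    """Return True if obj can be used as a dict key (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

# A list cannot be hashed, so {[4, 5]: 4.5} fails while the dict is being built.
list_ok = is_hashable([4, 5])
# A tuple of hashable items can be hashed, so it is usable as a key.
tuple_ok = is_hashable((4, 5))
```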

You can build a boolean mask of the rows that match and assign the replacement through it.

m = [v == [4, 5] for v in df['a']] 
df.loc[m, 'a'] = 4.5

df
               a
0              1
1              2
2              3
3            4.5
4  [apple, pear]

The same procedure works for ['apple', 'pear']. You can wrap this in a function if you wish:

def replace(df, col, key, val):
    m = [v == key for v in df[col]]
    df.loc[m, col] = val

replace(df, 'a', [4, 5], 4.5)
replace(df, 'a', ['apple', 'pear'], 'apple')

df
       a
0      1
1      2
2      3
3    4.5
4  apple

Note: The function works in-place.
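If mutating the original frame is not wanted, a variant along the following lines might be used instead. This is only a sketch: the name replace_lists and the pairs argument are made up here, and the isinstance check is an added assumption so that only list-valued cells are compared against the key.

```python
import pandas as pd

def replace_lists(df, col, pairs):
    """Return a copy of df with list-valued cells in col replaced.

    pairs is a list of (key_list, replacement) tuples (hypothetical signature).
    """
    out = df.copy()
    for key, val in pairs:
        # Compare only cells that are lists; everything else is left alone.
        mask = [isinstance(v, list) and v == key for v in out[col]]
        out.loc[mask, col] = val
    return out

df = pd.DataFrame({"a": [1, 2, 3, [4, 5], ["apple", "pear"]]})
cleaned = replace_lists(df, "a", [([4, 5], 4.5), (["apple", "pear"], "apple")])
```

Here df keeps its original list cells, and cleaned holds the replaced values.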


1 Comment

v == key for v in df[col] is not the correct way to do key matching in Pandas.
2

There is another way, using astype(str). Even though it works, I still highly recommend using cold's answer.

df.astype(str).replace({'[4, 5]':4.5,"['apple', 'pear']":"apple"})
Out[159]: 
       a
0      1
1      2
2      3
3    4.5
4  apple
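Two caveats with this approach can be seen in a small sketch: every cell becomes a string, including the numeric ones, and the replacement keys must match the exact text that str() produces for each list, quoting and spacing included. The variable names are chosen just for this example.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, [4, 5], ["apple", "pear"]]})

# Every cell is converted to its string form, numbers included.
as_text = df.astype(str)
values = as_text["a"].tolist()

# The key has to reproduce str(list) exactly: single quotes, comma plus space.
single_quoted_matches = "['apple', 'pear']" in values
double_quoted_matches = '["apple", "pear"]' in values
```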

3 Comments

Yeah... I thought about this, but the scope for errors is just too large (quoting issues, etc etc).
Last thing, it converts all items to str, even numeric ones ;-)
@cᴏʟᴅsᴘᴇᴇᴅ yes, that is why I think your answer is the right way for this type of question
0

I had a similar problem: the place names in one of my columns needed to be standardized. First I tried using a list as the key and the standard word as the value, which of course failed. So I wrote a function that expands the values in each list into keys and assigns the standard word as the value for all of them:

def list_to_dict(cities):
    new_dict = {}
    for key in cities:
        value = cities[key]
        for item in value:
            new_dict[item] = key
    
    return new_dict

With this, I was able to clean this list of strings that are all meant to denote Mexico City in Spanish (my real set is larger and covers more places, but this is an illustrative subset):

ciudades = list_to_dict({'Ciudad De Mexico' : ['Ciudad De México', 'Cuajimalpa De Morelos', 'Mexicocity', 'Ciudad De  Mexico', 'Miguel Hidalgo, Cdmx', 'Df', 'Cmx', 'Ciudad De M', 'Cdmx', 'Ciudad De M?Xico', 'C.D. M.X,', 'Mx-Cdm', 'Cuidad De Mexico', 'Dif', 'D.F.', 'D.F', 'DF', 'Distrito', 'Mexico City', 'Coyoacan', 'Mx-Cdm', 'Cdmex', 'Mx-Dif', 'Mexico Df', 'Ciudad_De_M']})

Resulting in:

[image: result]
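A minimal sketch of how such an expanded dictionary might then be applied with Series.replace. The frame places, the column name 'city', and the shortened variant list are assumptions made for illustration and are not the data from this answer.

```python
import pandas as pd

def list_to_dict(cities):
    """Invert {standard: [variants]} into {variant: standard}."""
    new_dict = {}
    for key, variants in cities.items():
        for item in variants:
            new_dict[item] = key
    return new_dict

# Shortened, illustrative subset of variants.
mapping = list_to_dict({"Ciudad De Mexico": ["Cdmx", "D.F.", "Mexico City"]})

# Hypothetical data; values that are not in the mapping are left unchanged.
places = pd.DataFrame({"city": ["Cdmx", "Mexico City", "Guadalajara", "D.F."]})
places["city"] = places["city"].replace(mapping)
```

Because every key is now a plain string, the dictionary is hashable-safe and can be passed to replace directly.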
