I am trying to remove all accents from my data. I found a function, but I cannot work out how to apply it to the entire DataFrame at once.
import unicodedata
import pandas as pd
def remove_accents(input_str):
    nfkd_form = unicodedata.normalize('NFKD', input_str)
    only_ascii = nfkd_form.encode('ASCII', 'ignore')
    return only_ascii
data = {'name': ['Guzmán', 'Molly'],
        'year': [2012, 2012]}
df = pd.DataFrame(data)
df
How can I apply the above function?
Is there any parameter in pandas read_csv that I can use to achieve similar output?
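On the read_csv part of the question: one option that may fit is the `converters` parameter, which takes a dict mapping a column to a function that is called on each raw cell value while the file is parsed. The sketch below assumes Python 3 and uses a hypothetical in-memory CSV (`csv_text`) in place of a real file. It also decodes the result back to `str`, which the function in the question does not do.

```python
import io
import unicodedata

import pandas as pd


def remove_accents(value):
    # Decompose accented characters, then drop the non-ASCII combining marks.
    nfkd_form = unicodedata.normalize('NFKD', value)
    # encode() returns bytes in Python 3, so decode back to a normal string.
    return nfkd_form.encode('ASCII', 'ignore').decode('ASCII')


# Hypothetical in-memory CSV standing in for a real file on disk.
csv_text = "name,year\nGuzmán,2012\nMolly,2012\n"

# converters is applied per column, so each text column has to be listed.
df = pd.read_csv(io.StringIO(csv_text), converters={'name': remove_accents})
print(df)
```

Because `converters` is keyed by column, this only helps when the text columns are known in advance. It is not a single switch that cleans the whole file.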
apply? Your case looks very straightforward. And I do not understand your last question entirely.
apply docs
unicodedata.normalize('NFKD', input_str) expects two parameters
df.name.apply(lambda x: unicodedata.normalize('NFKD', x).encode('ASCII', 'ignore'))
TypeError: normalize() argument 2 must be unicode, not str error. Also, I need to do it on the entire data frame all at once
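For the whole-DataFrame case, a minimal sketch follows, assuming Python 3, where every `str` is already Unicode. The TypeError quoted above is what Python 2 raises when `normalize()` receives a byte string instead of a `unicode` object. The `isinstance` guard is an addition that is not in the original function. It lets numeric columns such as `year` pass through unchanged.

```python
import unicodedata

import pandas as pd


def remove_accents(value):
    # Leave non-string cells (numbers, NaN) untouched.
    if not isinstance(value, str):
        return value
    # Decompose accented characters, then drop the non-ASCII combining marks.
    nfkd_form = unicodedata.normalize('NFKD', value)
    return nfkd_form.encode('ASCII', 'ignore').decode('ASCII')


data = {'name': ['Guzmán', 'Molly'],
        'year': [2012, 2012]}
df = pd.DataFrame(data)

# DataFrame.apply hands over one column at a time;
# Series.map then calls remove_accents on every cell in that column.
cleaned = df.apply(lambda col: col.map(remove_accents))
print(cleaned)
```

Going column by column with `Series.map` avoids relying on an element-wise DataFrame method, whose name has changed between pandas versions.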