1

I've tried replacing the ',' with an empty string:

df['amount'].replace(',','', regex=True).astype(float)

Error:

ValueError: could not convert string to float: 

I also tried converting the column directly:

df['amount'] = df['amount'].astype('float64')

I still get the same error. The sample data contains values like 5,000.00 and 1,00,000.234.

How can I convert it to float?

By the way, I'm reading a JSON file, passing only the path of the file.

3 Answers

1

I think you need to assign the result back:

import pandas as pd

df = pd.DataFrame({'amount':['5,000.00', '1,00,000.234']})

df['amount'] = df['amount'].replace(',','', regex=True).astype('float64')
print (df)
       amount
0    5000.000
1  100000.234

If that does not work, check whether the column contains some bad values:

df = pd.DataFrame({'amount':['5,000.00', '1,00,000.234', 'a']})
print (df)
         amount
0      5,000.00
1  1,00,000.234
2             a

print (df.loc[pd.to_numeric(df['amount'].replace(',','', regex=True), errors='coerce').isnull(), 'amount'])
2    a
Name: amount, dtype: object
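The error message in the question ends right after the colon, which suggests (an assumption, not confirmed by the asker) that some values are empty strings. A minimal sketch with hypothetical sample data, showing that an empty string makes astype(float) fail and that the same check flags it:

```python
import pandas as pd

# Hypothetical sample: the third value is an empty string (assumed bad value).
df = pd.DataFrame({"amount": ["5,000.00", "1,00,000.234", ""]})

# Remove the thousands separators as plain text (no regex needed).
stripped = df["amount"].str.replace(",", "", regex=False)

# astype(float) cannot parse the empty string, so it raises a ValueError.
try:
    stripped.astype(float)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# errors='coerce' turns unparseable values into NaN, so isna() marks the bad rows.
bad_rows = df.loc[pd.to_numeric(stripped, errors="coerce").isna(), "amount"]

print(error_message)
print(bad_rows)
```

If bad_rows comes back empty on the real data, the problem is probably elsewhere, for example in how the column is nested in the JSON.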

Then it is possible to convert the bad values to NaNs:

df = pd.DataFrame({'amount':['5,000.00', '1,00,000.234', 'a']})
print (df)
         amount
0      5,000.00
1  1,00,000.234
2             a

df['amount'] = pd.to_numeric(df['amount'].replace(',','', regex=True), errors='coerce', downcast='float')
print (df)

       amount
0    5000.000
1  100000.234
2         NaN

If you use pd.read_csv to create the DataFrame, add the parameter thousands=',':

df = pd.read_csv(file, thousands=',')
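
As a small self-contained illustration of the thousands parameter, here is a sketch that reads from an in-memory string instead of a file. The CSV text and column names are made up for the example; the amounts are quoted because they contain the delimiter.

```python
import io

import pandas as pd

# Hypothetical CSV content; quoting keeps the commas inside a single field.
csv_text = 'id,amount\n1,"5,000.00"\n2,"1,234,567.5"\n'

# thousands=',' tells the parser to ignore commas inside numeric fields.
df = pd.read_csv(io.StringIO(csv_text), thousands=",")

print(df.dtypes)
print(df)
```

The amount column should come back as a float column, so no replace step is needed afterwards.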

13 Comments

I'm using JSON. The first answer does not work; the second one does, but it does not convert the datatype to float.
Give me a sec. I'll explain more.
@jason - I think the second paragraph is for checking problematic values; then I added a solution for replacing them with NaNs. Please check it.
@jason - Try it without it; if it works fine, I think you can remove it with no problem.
Cool. Thanks for the heads-up!! :) You are too fast and too good !!! Thanks again!
1

Using pandas.to_numeric with pd.Series.str.replace works for this:

import pandas as pd

s = pd.Series(['5,000.00', '1,00,000.234'])

s = pd.to_numeric(s.str.replace(',', ''), downcast='float')

print(s)

# 0      5000.000
# 1    100000.234
# dtype: float64

However, a better idea is to fix this at source, if possible. For example, pandas.read_csv has arguments which allow you to account for such numeric formatting.
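
The question mentions JSON input. As far as I know, pd.read_json has no equivalent of the thousands option, so one possible approach is to clean the column right after loading. This is only a sketch: the JSON layout and the "amount" key are assumptions, and an inline string stands in for the file.

```python
import json

import pandas as pd

# Hypothetical JSON records; a real file would be read with json.load(open(path)).
raw = '[{"amount": "5,000.00"}, {"amount": "1,00,000.234"}, {"amount": ""}]'
df = pd.DataFrame(json.loads(raw))

# Strip the separators as literal text, then convert.
cleaned = df["amount"].str.replace(",", "", regex=False)

# errors='coerce' maps anything unparseable (here the empty string) to NaN.
df["amount"] = pd.to_numeric(cleaned, errors="coerce")

print(df.dtypes)
print(df)
```

Leaving out downcast keeps the column at the default float64 precision; whether to coerce bad values or raise on them depends on how the data is used later.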


0

Values like 5,000.00 can be converted to floats such as 5000.0 with a list comprehension:

df['Withdrawal Amt.'] = [float(str(i).replace(",", "")) for i in df['Withdrawal Amt.']]
