I have a large CSV file. The column I am interested in, 'Result', contains integers, floats, NaN, and numbers stored as text. I need to aggregate 'Result', but first I want to convert the column to a float data type. The text values have trailing spaces, like the following: ["1.07 ", "8.22 ", "8.6 ", "11.41 ", "7.93 "]

The error I get is...

AttributeError: Can only use .str accessor with string values!

import pandas as pd
import os
import numpy as np

csv_file = 'c:/path/to/file/big.csv'
# ... more lines of code ...

df = pd.read_csv(csv_file, usecols=my_cols, parse_dates=['Date'])
df = df[df['Company ID'].str.contains(my_company)]
print('df of csv created')
# Above code works great. 

# the below 2 tries did not work for me
# df['Result'] = pd.to_numeric(df['Result'].str.replace(' ', ''), errors='ignore')
# df['Result'] = df['Result'].str.strip() # causes an error 

# now let's try np.where...
# the below causes AttributeError: Can only use .str accessor with string values! 
df['Result'] = np.where(df['Result'].dtype == np.str, df['Result'].str.strip(), 
df['Result'])
df['Result'] = pd.to_numeric(df['Result'], downcast="float", errors='raise')

How should I resolve this?

  • If I remove the line df['Result'] = np.where(df['Result'].dtype == np.str, df['Result'].str.strip(), df['Result']), I get the error ValueError: Unable to parse string " " at position 1283. (Commented Feb 11, 2022 at 0:53)

1 Answer

Try explicitly converting every value to a string with astype(str) before using the .str accessor:

import pandas as pd

df = pd.DataFrame({
    'Result': [' a ', ' b', 'c ']
})

df['Result'] = df['Result'].astype(str).str.strip()
print(df['Result'])

#0    a
#1    b
#2    c
#Name: Result, dtype: object

I sometimes use this when a Series contains NaN or numbers alongside strings, because it avoids the .str accessor error. Note that astype(str) turns NaN into the string 'nan'.
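For the mixed column described in the question, the same idea can be followed by a numeric conversion. This is only a minimal sketch: the sample values are made up to mirror the question, and errors="coerce" is my assumption about how a whitespace-only cell (like the " " at position 1283) should be handled, namely by turning it into NaN rather than raising.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the mix described in the question:
# ints, floats, NaN, numbers as text with trailing spaces, and a blank cell.
df = pd.DataFrame({"Result": [1, 2.5, np.nan, "1.07 ", "8.22 ", " "]})

# Cast everything to str first so the .str accessor is valid for every row,
# then strip the surrounding whitespace.
cleaned = df["Result"].astype(str).str.strip()

# errors="coerce" converts anything that is not a number to NaN
# instead of raising ValueError.
df["Result"] = pd.to_numeric(cleaned, errors="coerce")

print(df["Result"])
```

If a blank cell should be treated as an error instead of missing data, keep errors="raise" and clean those rows beforehand.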

2 Comments

I did use this method, but I modified it because I got AttributeError: 'Series' object has no attribute 'strip'. So I did it in 2 steps: df['Result'] = df['Result'].astype(str), then df['Result'] = df['Result'].str.strip()
@Shane Oh, sorry, I forgot to add .str in there. You can do it in one line. I just edited my code to include a sample dataset. Thank you for pointing that out :)
