0

I have data that I want to split and convert to float32, but the conversion fails and the error shows the real number as a string.

    import numpy as np

    data = open('Path dataset')
    for line in data:
        train = np.array([np.float32(x) for x in line.split(",")])

The error I get is:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-83-53e8671c416d> in <module>
      1 for line in data_coba:
----> 2     train = np.array([np.float32(x) for x in line.split(",")[:]])          
      3 #print(train_test_coba)

<ipython-input-83-53e8671c416d> in <listcomp>(.0)
      1 for line in data_coba:
----> 2     train = np.array([np.float32(x) for x in line.split(",")[:]])           
      3 #print(train_test_coba)

ValueError: could not convert string to float: '50.89482266'

What is wrong with this?

6
  • It cannot parse this to convert it to float Commented Dec 3, 2019 at 5:59
  • @abhilb Yes, correct, but my data is real numbers; it does not contain any strings Commented Dec 3, 2019 at 6:02
  • Maybe use a regex to remove all unwanted characters Commented Dec 3, 2019 at 6:04
  • change the encoding while opening the file to utf-8-sig , ie. open('Path dataset', encoding='utf-8-sig') Commented Dec 3, 2019 at 6:05
  • You are hit by BOM. See en.wikipedia.org/wiki/Byte_order_mark Commented Dec 3, 2019 at 8:38
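
The last two comments point at a byte order mark (BOM): an invisible U+FEFF character at the very start of the file that ends up glued to the first number. A minimal sketch of the `utf-8-sig` suggestion follows, assuming the file is UTF-8 with a leading BOM. The sample file, its contents, and the helper name `load_rows` are hypothetical and only there to make the example runnable.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample file: writing with 'utf-8-sig' puts a BOM in front,
# mimicking a CSV exported by a tool that adds one.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", encoding="utf-8-sig") as f:
    f.write("50.89482266,1.5,2.25\n51.45440668,3.0,4.5\n")


def load_rows(path):
    """Read comma-separated floats; 'utf-8-sig' drops a leading BOM if present."""
    rows = []
    with open(path, encoding="utf-8-sig") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            rows.append([np.float32(x) for x in line.split(",")])
    return np.array(rows, dtype=np.float32)


train = load_rows(path)
print(train.shape)  # (2, 3)
```

Reading with `utf-8-sig` is harmless when the file has no BOM, so it should be safe to use either way.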

2 Answers

1

It seems that your dataset contains characters other than comma-separated numbers, so the error probably occurs when np.float32 tries to convert one of those non-numeric characters. I suggest checking your dataset again and maybe splitting it further.

1 Comment

Maybe you can separately check what [x for x in line.split(",")] gives as output and move on from there.
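
One way to do that check is to print each piece with `repr`, which makes invisible characters visible. This is only a sketch: the sample `line` and the helper `bad_tokens` are hypothetical, and plain `float` stands in for `np.float32`.

```python
# Hypothetical first line of the file, with an invisible BOM before the number.
line = "\ufeff50.89482266,1.5,2.25\n"

# repr() shows escape sequences instead of rendering invisible characters.
for token in line.split(","):
    print(repr(token))


def bad_tokens(line):
    """Return the pieces of a comma-separated line that float() rejects."""
    bad = []
    for token in line.split(","):
        try:
            float(token)
        except ValueError:
            bad.append(token)
    return bad


print(bad_tokens(line))
```

Surrounding whitespace such as the trailing newline is accepted by `float`, so only the token carrying the BOM is reported.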
0

You have to pass encoding='utf-8' to the open function:

    data = open('Path dataset', encoding='utf-8')

2 Comments

When I use this, I get this error: ValueError: could not convert string to float: '\ufeff51.45440668'
What does the '\ufeff' signify in the value? If it is OK to ignore those unwanted characters, you can strip them out. In: np.float32(re.sub(r'[^a-zA-Z0-9_\.]', '', '50.89482266')) Out: 50.89482
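
A caveat on the regex in the comment above: the character class keeps only letters, digits, underscores and dots, so it would also drop a leading minus sign. A narrower sketch that removes only the BOM is shown below; `clean_token` is a hypothetical helper, and plain `float` stands in for `np.float32`.

```python
import re

BOM = "\ufeff"  # U+FEFF, the byte order mark once decoded as text


def clean_token(token):
    """Remove only the BOM and surrounding whitespace, keeping signs intact."""
    return token.replace(BOM, "").strip()


# The broad pattern from the comment silently turns -1.5 into 1.5.
broad = re.sub(r"[^a-zA-Z0-9_\.]", "", "-1.5")
print(broad)

# The narrow cleanup keeps the sign and still removes the BOM.
print(float(clean_token("\ufeff-51.45440668")))
```

Which variant fits depends on the data: if negative values or exponents such as `1e-3` can occur, the narrower cleanup is probably the safer choice.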
