Python - ValueError: could not convert string to float:

Question

I am trying to build a simple decision tree, but I keep getting the same ValueError, and none of the similar threads were of any help. None of my variables are strings, yet I still get a conversion error.

from pandas import Series, DataFrame
import pandas as pd
import numpy as np
import os
import matplotlib.pylab as plt
# sklearn.cross_validation was removed in later releases; model_selection replaces it
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report
import sklearn.metrics

os.chdir(r"C:\Mlearning")  # raw string so the backslash is not read as an escape

"""
Data Engineering and Analysis
"""
#Load the dataset

AH_data = pd.read_csv("gapminder.csv")

data_clean = AH_data.dropna()

#data_clean.dtypes
#data_clean.describe()


"""
Modeling and Prediction
"""
#Split into training and testing sets

predictors = data_clean[['breastcancerper100th','alcconsumption']]

targets = data_clean.employrate

pred_train, pred_test, tar_train, tar_test  =   train_test_split(predictors, targets, test_size=.4)

pred_train.shape
pred_test.shape
tar_train.shape
tar_test.shape

#Build model on training data
classifier=DecisionTreeClassifier()
classifier=classifier.fit(pred_train,tar_train)

predictions=classifier.predict(pred_test)

sklearn.metrics.confusion_matrix(tar_test,predictions)
sklearn.metrics.accuracy_score(tar_test, predictions)

#Displaying the decision tree
from sklearn import tree
#from StringIO import StringIO
from io import StringIO
#from StringIO import StringIO 
from IPython.display import Image
out = StringIO()
tree.export_graphviz(classifier, out_file=out)
import pydotplus
graph=pydotplus.graph_from_dot_data(out.getvalue())
graph.write_pdf("graph.pdf")

But the result that I am getting is this one:

   array = np.array(array, dtype=dtype, order=order, copy=copy)

ValueError: could not convert string to float:

Is that error happening in your classifier.fit, or somewhere else? Can you post a sample of the data you are trying to classify?
– pekapa, Commented Jun 7, 2016 at 17:21
Can you edit your question to show the full traceback? The output of data_clean.dtypes would be useful, too (and perhaps data_clean.head(), if you can share it).
– Mark Dickinson, Commented Jun 7, 2016 at 17:39
It looks to me as though you're trying to predict a floating-point value (employment rate). That's a regression problem, not a classification problem. Try using DecisionTreeRegressor instead. We'll be able to help much better if you post a traceback, so that we can see which line the ValueError is coming from.
– Mark Dickinson, Commented Jun 8, 2016 at 18:18
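To illustrate the regression suggestion in the comment above, here is a minimal sketch that fits a DecisionTreeRegressor to a continuous target. The data is synthetic and the variable names are made up for the example; it says nothing about the real gapminder.csv, and it does not by itself address the string-to-float error. It also assumes a scikit-learn version in which train_test_split lives in sklearn.model_selection.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Synthetic stand-in data (assumption): two numeric predictors, one float target
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(50, 2))
y = 40 + 0.2 * X[:, 0] + rng.normal(0, 1, size=50)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

# A regressor accepts continuous targets; a classifier expects discrete labels
regressor = DecisionTreeRegressor(random_state=0)
regressor.fit(X_train, y_train)

predictions = regressor.predict(X_test)
mse = mean_squared_error(y_test, predictions)
print(predictions.shape, mse)
```

For a regressor, an error measure such as mean squared error takes the place of the confusion matrix and accuracy score used in the question.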

Dan Romescu · Accepted Answer · 2017-06-10 03:18:54Z

1

You can use pd.to_numeric (introduced in version 0.17) to convert a column or a Series to a numeric type. The function can also be applied over multiple columns of a DataFrame using apply.

Importantly, the function also takes an errors keyword argument that lets you coerce non-numeric values to NaN, or simply ignore columns containing these values.

It will work if you convert all entries to numeric. I use a small function for this:

def convert_column_numeric(ax):
    predictors[ax] = pd.to_numeric(predictors[ax], errors='coerce')

.....

convert_column_numeric('breastcancerper100th')
convert_column_numeric('alcconsumption')
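As a rough sketch of the apply variant mentioned at the top of this answer, the conversion can be done for several columns at once and followed by dropna. The small DataFrame below is made-up stand-in data, not the real gapminder.csv, and the blank-string entries are only an assumption about what the offending values might look like.

```python
import pandas as pd

# Hypothetical stand-in for gapminder.csv: some cells hold blank strings
df = pd.DataFrame({
    "breastcancerper100th": ["26.8", " ", "57.1"],
    "alcconsumption": ["0.03", "7.29", ""],
    "employrate": ["55.7", "51.4", " "],
})

cols = ["breastcancerper100th", "alcconsumption", "employrate"]

# errors="coerce" turns every value that cannot be parsed into NaN
numeric = df[cols].apply(pd.to_numeric, errors="coerce")

# Drop incomplete rows only after the conversion, so coerced NaNs are removed too
data_clean = numeric.dropna()

print(numeric.dtypes)
print(data_clean)
```

Converting before calling dropna matters here: a blank string is not NaN, so dropna alone would leave such rows in place.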


Evan · Accepted Answer · 2016-06-07 17:49:04Z

0

It is most likely a problem with the data. Since you don't have any point in the code where you attempt to convert to float, it must be that the data you have is in a form that prevents it from being read as a number by your parsing commands.
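One way to check this, sketched here on made-up data rather than the real file, is to look at the column dtypes and then list the values that fail numeric parsing. The column names are borrowed from the question; the whitespace-only cell is an assumption about what such a value might look like.

```python
import pandas as pd

# Hypothetical sample: one cell holds a whitespace-only string
df = pd.DataFrame({
    "alcconsumption": ["0.03", " ", "7.29"],
    "employrate": [55.7, 51.4, 46.9],
})

# A column read as text shows up with an object or string dtype, not float64
print(df.dtypes)

# Values that were present in the data but cannot be parsed as numbers
converted = pd.to_numeric(df["alcconsumption"], errors="coerce")
is_bad = converted.isna() & df["alcconsumption"].notna()
bad_values = df.loc[is_bad, "alcconsumption"]
print(bad_values.tolist())
```

Printing the offending values with tolist() makes blank or whitespace-only strings visible, which are easy to miss in a normal DataFrame display.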
