I am trying to draw a scatter plot from the input below using pandas and matplotlib, but I get the error shown further down.

Input:

tweetcricscore  34 #afgvssco   51
tweetcricscore  23 #afgvszim   46
tweetcricscore  24 #banvsire   12
tweetcricscore  456 #banvsned  46
tweetcricscore  653 #canvsnk   1
tweetcricscore  789 #cricket   178
tweetcricscore  625 #engvswi   46
tweetcricscore  86 #hkvssco    23
tweetcricscore  3 #indvsban    1
tweetcricscore  87 #sausvsvic  8
tweetcricscore  98 #wt20       56

Code:

import numpy as np
import matplotlib.pyplot as plt
from pylab import*
import math
from matplotlib.ticker import LogLocator
import pandas as pd

df = pd.read_csv('input.csv', header = None)

df.columns = ['col1','col2','col3','col4']

plt.scatter(x='col2', y='col4', s=120, c='b', label='Highly Active')

plt.legend(loc='upper right')
plt.xlabel('Freq (x)')
plt.ylabel('Freq(y)')
#plt.gca().set_xscale("log")
#plt.gca().set_yscale("log")
plt.show()

Of the four columns, I want to plot the second and fourth (col2 and col4, i.e. indices 1 and 3) as (x, y) pairs in the scatter plot.

Error:

Traceback (most recent call last):
  File "00_scatter_plot.py", line 14, in <module>
    plt.scatter(x='col2', y='col3', s=120, c='b', label='Highly Active')
  File "/usr/lib/pymodules/python2.7/matplotlib/pyplot.py", line 3087, in scatter
    linewidths=linewidths, verts=verts, **kwargs)
  File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 6337, in scatter
    self.add_collection(collection)
  File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 1481, in add_collection
    self.update_datalim(collection.get_datalim(self.transData))
  File "/usr/lib/pymodules/python2.7/matplotlib/collections.py", line 185, in get_datalim
    offsets = np.asanyarray(offsets, np.float_)
  File "/usr/local/lib/python2.7/dist-packages/numpy/core/numeric.py", line 514, in asanyarray
    return array(a, dtype, copy=False, order=order, subok=True)
ValueError: could not convert string to float: col2

2 Answers

This is your error message:

File "00_scatter_plot.py", line 14, in <module>
    plt.scatter(x='col2', y='col3', s=120, c='b', label='Highly Active')
ValueError: could not convert string to float: col2

As the message says, matplotlib is trying to convert the string "col2" to a float: plt.scatter received the column name itself rather than the column's data.

Looking at your code, you probably want to pass the DataFrame columns, like this:

plt.scatter(x=df['col2'], y=df['col4'], s=120, c='b', label='Highly Active')

instead of:

plt.scatter(x='col2', y='col4', s=120, c='b', label='Highly Active')
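
For reference, here is a minimal self-contained sketch of the corrected call. It assumes input.csv is comma-separated and inlines three rows from the question instead of reading the file; the Agg backend and the name points are only there so the sketch runs without a display and can be inspected.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Assumption: the real file is comma-separated. Three rows from the
# question are inlined here in place of input.csv.
csv_text = """tweetcricscore,34,#afgvssco,51
tweetcricscore,23,#afgvszim,46
tweetcricscore,24,#banvsire,12
"""

df = pd.read_csv(io.StringIO(csv_text), header=None)
df.columns = ['col1', 'col2', 'col3', 'col4']

fig, ax = plt.subplots()
# Pass the column data (Series objects), not the column names.
points = ax.scatter(df['col2'], df['col4'], s=120, c='b', label='Highly Active')
ax.legend(loc='upper right')
ax.set_xlabel('Freq (x)')
ax.set_ylabel('Freq (y)')
```

With an interactive backend, finishing with plt.show() displays the figure as in the original script.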

1 Comment

This used to be an issue (see issue #12380). In recent matplotlib releases (3.x) you can use the column names without passing the Series objects, as long as you also pass the DataFrame through the data keyword (data=df).
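
The column-name form this comment refers to relies on matplotlib's data keyword: when data=df is given, string arguments for x and y are looked up as keys in df. A minimal sketch, again assuming comma-separated input and inlining a few rows from the question rather than reading input.csv:

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Assumption: comma-separated input; rows copied from the question.
csv_text = """tweetcricscore,456,#banvsned,46
tweetcricscore,653,#canvsnk,1
tweetcricscore,789,#cricket,178
"""

df = pd.read_csv(io.StringIO(csv_text), header=None)
df.columns = ['col1', 'col2', 'col3', 'col4']

fig, ax = plt.subplots()
# 'col2' and 'col4' are resolved against df because data=df is supplied.
points = ax.scatter('col2', 'col4', data=df, s=120, label='Highly Active')
ax.legend(loc='upper right')
```

Without data=df, the strings are not treated as column names, which is the situation in the question.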

Try changing:

plt.scatter(x='col2', y='col4', s=120, c='b', label='Highly Active')

to:

df.plot.scatter(x='col2', y='col4', s=120, c='b', label='Highly Active')

It worked for me.
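
A minimal runnable sketch of the pandas variant, under the same assumptions as the question (comma-separated input, four columns); a few rows are inlined in place of input.csv. DataFrame.plot.scatter takes column names directly and returns a matplotlib Axes, so labels can still be set afterwards.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Assumption: comma-separated input; rows copied from the question.
csv_text = """tweetcricscore,625,#engvswi,46
tweetcricscore,86,#hkvssco,23
tweetcricscore,3,#indvsban,1
"""

df = pd.read_csv(io.StringIO(csv_text), header=None)
df.columns = ['col1', 'col2', 'col3', 'col4']

# pandas looks the column names up in df itself, so plain strings work here.
ax = df.plot.scatter(x='col2', y='col4', s=120, c='b', label='Highly Active')
ax.set_xlabel('Freq (x)')
ax.set_ylabel('Freq (y)')
```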

5 Comments

Great! Thank you. Both solutions work for me.
@SitzBlogz, always glad to help.
As I can accept only one answer, I would like to accept @Dot_py's, since he answered first and also has fewer points :) Thank you again for the answer.
@SitzBlogz, sure, no prob! :)
I am actually trying to solve this problem, so maybe you can help here as well: stackoverflow.com/questions/37147592/…
