I want to plot a histogram of a very basic pandas Series. In the example below, I simply want the x-axis to show "ice-cream", "chocolate", and "coffee", and the y-axis to show their counts (2, 3, 1). Is this possible? Note that the index (the first column) is not sequential because I have filtered out NaN values.

print(rbr_null_false)
45    ice-cream
101   chocolate
102   ice-cream
103   coffee
112   chocolate
120   chocolate

fig, ax = plt.subplots()
ax.hist(rbr_null_false)
plt.show()

This resulted in the following error:

    ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-7d1a5e1bb62b> in <module>()
     28 
     29 fig, ax = plt.subplots()
---> 30 ax.hist(rbr_null_false)
     31 #plt.xlabel('index', fontsize=12);
     32 #plt.ylabel('prod_rollback_date', fontsize=12);

~/anaconda3/lib/python3.5/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
   1810                     warnings.warn(msg % (label_namer, func.__name__),
   1811                                   RuntimeWarning, stacklevel=2)
-> 1812             return func(ax, *args, **kwargs)
   1813         pre_doc = inner.__doc__
   1814         if pre_doc is None:

~/anaconda3/lib/python3.5/site-packages/matplotlib/axes/_axes.py in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
   5993             xmax = -np.inf
   5994             for xi in x:
-> 5995                 if len(xi) > 0:
   5996                     xmin = min(xmin, xi.min())
   5997                     xmax = max(xmax, xi.max())

TypeError: len() of unsized object

1 Answer

Although you asked for a histogram, what you want is actually a bar plot. "A histogram is an accurate graphical representation of the distribution of numerical data." Your example is categorical data, so count the occurrences of each value and plot the counts as bars:

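import io

import matplotlib.pyplot as plt
import pandas as pd

# Sample data from the question: an index and a value separated by whitespace.
data = """45    ice-cream
101 chocolate
102 ice-cream
103 coffee
112 chocolate
120 chocolate"""
# read_table splits on tabs by default; sep=r'\s+' splits on any run of whitespace.
df = pd.read_table(io.StringIO(data), sep=r'\s+', header=None)
s = df[1]

# Count each category, then draw one bar per category.
s.value_counts().plot(kind='bar')
plt.show()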
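If the data is already in a Series, as in the question, the round trip through a string is not needed. The following is a minimal sketch under that assumption; the variable names `s`, `counts`, `fig`, and `ax` are illustrative and not from the original post. Because `value_counts()` sorts by frequency, the sketch reindexes by order of first appearance to get the "ice-cream", "chocolate", "coffee" order the question asks for.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Rebuild the series from the question, keeping its non-sequential index.
s = pd.Series(
    ["ice-cream", "chocolate", "ice-cream", "coffee", "chocolate", "chocolate"],
    index=[45, 101, 102, 103, 112, 120],
)

# value_counts() orders by frequency; reindexing by s.unique() restores
# the order in which each category first appears in the series.
counts = s.value_counts().reindex(s.unique())

# Draw one bar per category directly with matplotlib.
fig, ax = plt.subplots()
ax.bar(list(counts.index), counts.to_numpy())
ax.set_ylabel("count")
```

Call `plt.show()` afterwards to display the figure. The non-sequential index plays no role here, since only the values are counted.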

1 Comment

Is there a way to avoid the cropping/clipping issue? In the attached example, both "chocolate" and "ice-cream" have been partially cut off; with slightly longer strings (10-15 characters) it can be very hard to tell what the label is supposed to be.
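One possible way to reduce the clipping, offered as a sketch rather than a definitive fix: rotate the tick labels with the `rot` argument of the pandas plot call and let `fig.tight_layout()` adjust the margins so the labels fit inside the figure. The series below reuses the question's data; the variable names are illustrative.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same categorical data as in the question.
s = pd.Series(
    ["ice-cream", "chocolate", "ice-cream", "coffee", "chocolate", "chocolate"]
)
counts = s.value_counts()

fig, ax = plt.subplots()
# rot=45 tilts the x tick labels so longer strings take less vertical room.
counts.plot(kind="bar", rot=45, ax=ax)
# tight_layout() recomputes the margins so the labels are not cut off.
fig.tight_layout()
```

For much longer labels, `kind="barh"` draws horizontal bars, which puts the labels on the y-axis where they can be read without rotation.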
