
I have a very long list of IDs (the IDs are string values) and I want to plot a histogram of it. Other Stack Overflow threads have code for plotting a histogram, but the histogram I want should look like this picture: the highest values are on the left, and the values gradually decrease as the x-axis increases.

This is the code I use to plot a regular histogram:

import pandas
from collections import Counter
items = [...]  # placeholder: a long list of ID strings
letter_counts = Counter(items)
df = pandas.DataFrame.from_dict(letter_counts, orient='index')
df.plot(kind='bar')

The histogram

  • Are you asking how to read the data into something matplotlib can read, or how to format the histogram once you've read the data and converted it to numerical values? Commented Jul 27, 2016 at 20:06
  • @Aaron I edited the question Commented Jul 27, 2016 at 20:10

1 Answer


How about something along these lines:

from collections import Counter
import matplotlib.pyplot as plt
import numpy as np

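# Count how many times each string occurs
counts = Counter(['a','a','a','c','a','a','c','b','b','d', 'd','d','d','d','b'])
# (element, count) pairs, sorted from most to least frequent
common = counts.most_common()
labels = [item[0] for item in common]
number = [item[1] for item in common]
nbars = len(common)

# One bar per distinct element, tallest on the left
plt.bar(np.arange(nbars), number, tick_label=labels)
plt.show()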

The most_common() call is the main innovation of this script. The rest is easily found in the matplotlib documentation (already linked in my comment).
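To make the ordering step concrete, here is a minimal sketch of what most_common() contributes, using the same sample data as above. The helper name sorted_bar_data is hypothetical and only for illustration; it assumes Python 3.7+ so that ties keep first-seen order.

```python
from collections import Counter


def sorted_bar_data(items):
    """Return (labels, heights) ordered from most to least frequent."""
    # most_common() yields (element, count) pairs in descending count order
    common = Counter(items).most_common()
    if not common:
        return [], []
    labels, heights = zip(*common)
    return list(labels), list(heights)


items = ['a', 'a', 'a', 'c', 'a', 'a', 'c', 'b', 'b', 'd', 'd', 'd', 'd', 'd', 'b']
labels, heights = sorted_bar_data(items)
print(labels)
print(heights)
```

The two lists can then be passed to plt.bar(range(len(labels)), heights, tick_label=labels). With tens of thousands of distinct IDs the tick labels will likely be unreadable, so it may be worth plotting only the top entries, for example with most_common(50).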


5 Comments

pyplot has tons of options to make your graph look as pretty as you wish. The documentation has some really nice examples with source code as well.
most_common() removes some elements. For example, the original list contains 345000 elements, but most_common() returns only 43000.
Counter subclasses dict, so you can do: `labels = list(counts.keys()); number = list(counts.values())`
@HimanUCC The whole idea of using Counter is to reduce the number of elements. most_common() returns a list of tuples, each containing an element and how many times it occurred. If there are any identical elements at all, the list returned by most_common() will naturally be shorter than the input list.
@story645 counts.keys() would not preserve the order generated by most_common(), which is important if you want the largest columns on the left.
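The two points raised in these comments can be checked with a small sketch on made-up data (the list below is an illustrative assumption, not the asker's IDs): the counts still add up to the input length, and keys() follows insertion order rather than frequency order on Python 3.7+.

```python
from collections import Counter

items = ['c', 'a', 'b', 'a', 'c', 'a']  # illustrative sample data
counts = Counter(items)

# No occurrences are lost: the counts add up to the length of the input list.
total = sum(counts.values())

# Fewer entries than input elements simply means some elements repeat.
n_unique = len(counts)

# keys() follows first-insertion order (Python 3.7+), not frequency order.
insertion_order = list(counts.keys())
frequency_order = [element for element, _ in counts.most_common()]

print(total, n_unique)
print(insertion_order)
print(frequency_order)
```

So a list of 345000 IDs collapsing to 43000 entries would just mean there are 43000 distinct IDs, and the bar heights should still sum to 345000.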
