0

I am trying to plot a simple bar plot for a keyword vs. frequency list. As the data does not have a header, I am unable to use Pandas or Seaborn.

Input

#kyuhyun,1
#therinewyear,4
#lingaa,2
#starts,1
#inox,1
#arrsmultiplex,1
#bollywood,1
#kenya,1
#time,1
#watch,1
#malaysia,3

Code:

from matplotlib import pyplot as plt
from matplotlib import*
import numpy as np 

x,y = np.genfromtxt('theri_split_keyword.csv', delimiter = ',', unpack=True, comments=None, usecols=(0,1))

plt.bar(x,y)

plt.title('Info')
plt.ylabel('Y axis')
plt.xlabel('X axis')

plt.show()

All I am trying to plot is a bar graph with the keywords on the x axis and the frequencies on the y axis. Any easy method to plot this would be a huge help.

The output I am getting (attached as an image) is definitely NOT what I am looking for.

The solution below works like a charm, but I have too many keywords in the list. Is there a way to plot only the top 10-20 keywords with their respective frequencies, so that the bar plot looks much nicer?

Output of the solution given in the answers (attached as an image).

3 Answers

1
    import csv

    import matplotlib.pyplot as plt
    import numpy as np

    x = []
    y = []
    # Text mode with newline='' is what the csv module expects on Python 3.
    with open('theri_split_keyword.csv', newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        for row in reader:
            x.append(row[0].lstrip('#'))  # keyword without the leading '#'
            y.append(int(row[1]))         # frequency as an integer

    ind = np.arange(len(x))  # the x locations for the bars

    fig, ax = plt.subplots()
    ax.bar(ind, y)

    ax.set_ylabel('Y axis')
    ax.set_xlabel('X axis')
    ax.set_xticks(ind)  # bars are centered on ind by default in matplotlib 2.0+
    ax.set_xticklabels(x, rotation='vertical')

    plt.show()

2 Comments

Hi, thank you for the working solution. I have one problem, though: the list of keywords is too large. Is there any way I can take maybe the top 10-20 keywords and their respective frequencies? I am including the final plot in the edit. Please suggest whether there is an option to choose the top keywords.
@SitzBlogz: If you have an additional question, please post it as a separate question; and if this answered your original question, please accept it.
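For the follow-up about keeping only the most frequent keywords, here is a minimal sketch rather than a definitive solution. It assumes the same two-column layout as in the question; the helper names top_keywords and plot_keywords and the shortened SAMPLE rows are made up for illustration. The idea is to sort the (keyword, count) pairs by count and slice off the first n before plotting.

```python
import csv
import io

# A few rows copied from the question; in practice, pass the opened file instead.
SAMPLE = """#kyuhyun,1
#therinewyear,4
#lingaa,2
#starts,1
#malaysia,3
"""


def top_keywords(lines, n=10):
    """Return the n most frequent (keyword, count) pairs, highest count first."""
    pairs = [(row[0].lstrip('#'), int(row[1])) for row in csv.reader(lines) if row]
    # sorted() is stable, so keywords with equal counts keep their file order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:n]


def plot_keywords(pairs):
    """Draw one bar per (keyword, count) pair (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported here so top_keywords needs no plotting library

    labels, counts = zip(*pairs)
    positions = list(range(len(labels)))
    fig, ax = plt.subplots()
    ax.bar(positions, counts)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels, rotation='vertical')
    ax.set_ylabel('Frequency')
    fig.tight_layout()
    plt.show()


top3 = top_keywords(io.StringIO(SAMPLE), n=3)
print(top3)
```

With the real file, opening it with open('theri_split_keyword.csv', newline='') and calling plot_keywords(top_keywords(f, n=20)) should restrict the plot to the 20 most frequent keywords.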
1

This does not answer your question directly, but pandas does not require the data to have a header. If you read the data from a file, just pass header=None (see the pandas read_csv documentation for more).

import pandas as pd

df = pd.read_csv(myPath, header=None)
df.columns = ['word', 'freq']  # my custom header
df = df.set_index('word')  # not necessary, but provides the words as ticks on the plot
df.plot(kind='bar')

You can also pass the data as a dictionary, for example:

df = pd.DataFrame({'word': ['w1', 'w2', 'w3'], 'freq': [1, 2, 3]})
df.plot.bar(x='word', y='freq')  # use the words as tick labels

1 Comment

Could you please help with the full code, to give me an idea of how to read the columns using pandas when I have no headers?
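A fuller sketch of the pandas route, for illustration only. It reads from an in-memory string standing in for the file, so the csv_text variable and the shortened sample rows are assumptions; the column names 'word' and 'freq' follow the snippet above. Passing names= together with header=None labels the columns while reading.

```python
import io

import pandas as pd

# Stand-in for the file; pd.read_csv also accepts a path such as 'theri_split_keyword.csv'.
csv_text = io.StringIO("#kyuhyun,1\n#therinewyear,4\n#lingaa,2\n#malaysia,3\n")

# header=None tells pandas the first row is data; names= supplies the column labels.
df = pd.read_csv(csv_text, header=None, names=['word', 'freq'])
df['word'] = df['word'].str.lstrip('#')  # drop the leading '#' from each keyword
freq = df.set_index('word')['freq']      # Series of counts indexed by keyword

print(freq)
```

From here, freq.plot(kind='bar') should draw one bar per keyword (matplotlib is needed for that), and freq.nlargest(20).plot(kind='bar') would limit the plot to the 20 most frequent keywords.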
0

I'm not familiar with np.genfromtxt, but I suspect the problem is that it returns x as an array of strings when x should be numerical.

Maybe try something like:

tick_marks = np.arange(len(x))
plt.bar(tick_marks, y)
plt.xticks(tick_marks, x, rotation=45)
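A small check of what np.genfromtxt returns for this kind of data, as a sketch on in-memory sample rows rather than the real file. As far as I can tell, with the default float dtype the keyword column comes back as nan rather than as strings, which would explain the odd plot. A structured dtype keeps both columns; the field names 'word' and 'freq' and the 'U50' string width are assumptions chosen for illustration.

```python
import io

import numpy as np

rows = "#kyuhyun,1\n#therinewyear,4\n#lingaa,2\n"

# Default dtype is float, so the keyword column cannot be parsed and becomes nan.
as_float = np.genfromtxt(io.StringIO(rows), delimiter=',', comments=None,
                         encoding='utf-8')
print(as_float[:, 0])

# A structured dtype reads the first column as text and the second as integers.
# comments=None keeps '#' from being treated as the start of a comment.
data = np.genfromtxt(io.StringIO(rows), delimiter=',', comments=None,
                     dtype=[('word', 'U50'), ('freq', 'i8')], encoding='utf-8')

words = np.char.lstrip(data['word'], '#')  # keywords without the leading '#'
counts = data['freq']
print(words, counts)
```

The words and counts arrays can then be used in place of x and y in the snippet above.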
