How to plot a line graph of multiple rows in a Pandas DataFrame [duplicate]

Question

I have a Pandas DataFrame of measurements:

,Fp076,Fp084,Fp092,Fp099,Fp107,Fp115,Fp122,Fp130,Fp143,Fp151,Fp158,Fp166,Fp174,Fp181,Fp189,Fp197,Fp204,Fp212,Fp220,Fp227
0,0.531743,0.512256,0.427771,0.444216,0.332228,0.296139,0.202653,0.298724,0.341529,0.276829,0.24803,0.278406,0.345853,0.317384,0.32032,0.179936,0.205871,0.495948,0.167417,0.097147
1,-0.032964,0.047469,0.128079,0.142839,0.253755,0.165963,0.210111,0.239816,0.162333,0.115085,0.129781,0.134795,0.09575,0.243093,0.10684,0.195201,0.143984,0.266312,0.198049,0.084467
2,0.459728,0.541346,0.830889,0.368135,0.407241,0.499617,0.383159,0.507517,0.409411,0.325441,0.305605,0.378738,0.342981,0.43766,0.295844,0.228164,0.276319,0.226467,0.375678,0.219189
3,2.6838,2.394591,2.493416,0.874906,2.113343,1.812258,1.667047,1.779347,1.515663,1.620196,1.539494,1.63528,1.555373,1.471318,1.610067,1.507087,1.467174,1.458346,1.681998,1.14625
4,0.368415,0.435004,0.155035,0.161064,0.180133,0.202117,0.142981,0.138321,0.122557,0.099213,0.098213,0.062174,0.123664,0.2051,0.167415,0.185133,0.127677,0.037875,0.156252,0.015579
5,0.213577,0.187244,0.274151,0.173572,0.296122,0.308341,0.164578,0.159559,0.318383,0.181329,0.260223,0.257395,0.241779,0.292731,0.244476,0.187523,0.247331,0.293338,0.323894,0.179478
6,0.096093,0.140454,0.067185,6.441058,0.016797,0.141757,0.181792,0.13692,0.204091,0.180182,0.149626,0.220342,0.179286,0.276316,0.104531,0.20343,0.045161,-0.004546,0.045833,0.193849
7,0.286467,0.086673,-0.106538,-0.261802,0.16964,0.182858,0.062774,0.20471,0.040105,0.086975,0.211068,0.182423,0.098721,0.077085,0.102986,0.129935,0.130571,0.176024,0.154079,0.102391
8,0.480631,0.714554,0.858241,0.746666,0.555411,0.452689,0.337912,0.333942,0.269359,0.221312,0.09818,0.226218,0.287361,0.209858,0.222951,0.207584,0.258397,0.026713,0.162048,0.149924
9,1.055405,0.638777,0.468793,0.41544,0.559187,0.471218,0.493805,0.544716,0.412903,0.412182,0.51041,0.383991,0.351397,0.383201,0.368308,0.237954,0.330242,0.262648,0.425204,0.434928
10,1.116658,0.737544,0.854376,-0.004434,0.419419,0.35921,0.377095,0.273815,0.258913,0.290614,0.271843,0.321572,0.234764,0.298931,0.206039,0.192746,0.200727,0.132419,0.229914,0.159857
11,-0.004305,0.052289,0.275035,-0.849414,0.104146,0.185819,0.128376,0.136433,0.091787,0.149753,0.107246,0.081407,0.118816,0.117434,0.169153,0.108273,0.205751,0.145238,0.153086,0.114278
12,0.836223,0.323901,0.269564,0.364082,0.343695,0.386785,0.24881,0.307267,0.222634,0.214189,0.12167,0.251107,0.134083,0.284545,0.175479,0.221877,0.184749,0.225089,0.205388,0.214972

where each row holds the flux measurements at the frequencies given in the header (76, 84, 92, 99, ... MHz). I'm trying to plot a line graph of the flux measurements for a row. Since the frequencies in the header are not evenly spaced, I've tried this:

import numpy as np
import matplotlib.pyplot as plt

f = np.array([76,84,92,99,107,115,122,130,143,151,158,166,174,181,189,197,204,212,220,227])
y1 = [0.531743,0.512256,0.427771,0.444216,0.332228,0.296139,0.202653,0.298724,0.341529,0.276829,0.24803,0.278406,0.345853,0.317384,0.32032,0.179936,0.205871,0.495948,0.167417,0.097147]
y2 = [-0.032964,0.047469,0.128079,0.142839,0.253755,0.165963,0.210111,0.239816,0.162333,0.115085,0.129781,0.134795,0.09575,0.243093,0.10684,0.195201,0.143984,0.266312,0.198049,0.084467]
y3 = [0.459728,0.541346,0.830889,0.368135,0.407241,0.499617,0.383159,0.507517,0.409411,0.325441,0.305605,0.378738,0.342981,0.43766,0.295844,0.228164,0.276319,0.226467,0.375678,0.219189]

fig, ax = plt.subplots()
ax.scatter(f, y1, label = r'$\alpha = -0.37$')
ax.plot(f, y1)
ax.scatter(f, y2, label = r'$\alpha = NaN$')
ax.plot(f, y2)
ax.scatter(f, y3, label = r'$\alpha = -0.75$')
ax.plot(f, y3)
ax.set_xlabel('Frequency (MHz)')
ax.set_ylabel('Flux (Jy/beam)')
ax.grid(which = 'both', axis = 'both')
ax.legend()  # display the labels passed to the scatter calls
plt.show()

which is just copy-pasting the first three rows of data, to produce a scatter-and-line plot of flux against frequency for those three rows.

That's basically what I want, but what's a better way to do it?

The default pandas plot kind is a line plot, with each column drawn as a separate line. If you take the transpose of your dataframe, each row turns into a column.
– Nick ODell, Commented Jul 17, 2022 at 0:00
This is the correct way: 1. df.columns = df.columns.str.replace('Fp', '').astype('int'), 2. df = df.T, 3. ax = df.plot(marker='.', figsize=(10, 7), title='Flux per Frequency', ylabel='Flux (Jy/beam)', xlabel='Frequency (MHz)', grid=True)
– Trenton McKinney, Commented Jul 17, 2022 at 0:13
The hsv colormap was used to add more colors, since there are many observations. There are more colormaps at Choosing Colormaps in Matplotlib.
– Trenton McKinney, Commented Jul 17, 2022 at 1:10
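
A minimal sketch of the pandas-only approach described in the comments above, assuming the dataframe is already loaded as `df` with the `Fp...` headers. The small frame built here is only a shortened stand-in for the question's data (first three rows, first four columns), and the `Agg` backend is an assumption made so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import pandas as pd

# Shortened stand-in for the question's dataframe (first 3 rows, first 4 columns)
df = pd.DataFrame(
    {
        "Fp076": [0.531743, -0.032964, 0.459728],
        "Fp084": [0.512256, 0.047469, 0.541346],
        "Fp092": [0.427771, 0.128079, 0.830889],
        "Fp099": [0.444216, 0.142839, 0.368135],
    }
)

# 1. Turn the 'Fp076'-style headers into integer frequencies in MHz
df.columns = df.columns.str.replace("Fp", "").astype(int)

# 2. Transpose so each original row becomes a column (one line per row)
df_t = df.T

# 3. Plot; the integer index becomes the x axis, so point spacing follows the frequencies
ax = df_t.plot(
    marker=".",
    figsize=(10, 7),
    title="Flux per Frequency",
    xlabel="Frequency (MHz)",
    ylabel="Flux (Jy/beam)",
    grid=True,
    colormap="hsv",
)
```

Because the x values come from the numeric index rather than from column positions, the uneven frequency steps (for example 92 to 99 versus 130 to 143) are drawn to scale.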

Chinny84 · Accepted Answer · 2022-07-17 00:35:04Z

1

There are many ways to solve this problem, but the simplest way (that I can think of) is to transpose your dataframe and then use seaborn to plot all the columns.

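import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# convert your sample data; `data_` is assumed to hold the CSV text from the question as one string
data = [[e for e in row.split(',') if e] for row in data_.split("\n") if row]
columns = data[0]
# create the `x` axis: strip the 'Fp' prefix so the headers become integer frequencies
columns = [int(col.replace('Fp','')) for col in columns]
# the header row has no label for the row-number column, so add one
columns = ['index'] + columns
data = data[1:]
df = pd.DataFrame(data=data, columns=columns)
df = df.drop(columns=['index'])
# the values were parsed as strings, so convert them to floats
df = df.astype('float')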

Without the int(col.replace('Fp','')) step, the dataframe keeps the original headers (Fp076, Fp084, ...).

If you already have the dataframe, you can transform its columns as I did above using

df.columns = [int(col.replace('Fp','')) for col in df.columns]

Once this is done, you can transpose the dataframe and plot it:

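# transpose your data so that each original row becomes a column
df_ = df.T

# plot your data: one line per column, with the frequencies (the index) on the x axis
plt.figure(figsize=(15,8))
sns.lineplot(data=df_)
plt.title('Example of timeseries plot')
plt.xlabel('Frequency (MHz)')
plt.ylabel('Flux (Jy/beam)')
plt.show()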

The output is a single figure with one line per row of the original dataframe.

You can play around with the various plotting options to your liking, but this would be the simplest way (tip: lean on the seaborn or pandas plotting methods as much as possible for aggregated plots like this).
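
If you would rather keep the plain matplotlib look from the question (markers plus connecting lines, one legend entry per row), a loop over the rows avoids the copy-pasting. This is only a sketch: the small `df` is a stand-in built from a few of the question's values, `rows_to_plot` is a hypothetical name for whichever row labels you want, and the `Agg` backend is an assumption so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Shortened stand-in for the question's dataframe (first 3 rows, first 4 columns)
df = pd.DataFrame(
    {
        "Fp076": [0.531743, -0.032964, 0.459728],
        "Fp084": [0.512256, 0.047469, 0.541346],
        "Fp092": [0.427771, 0.128079, 0.830889],
        "Fp099": [0.444216, 0.142839, 0.368135],
    }
)

# Frequencies in MHz taken from the column headers
freqs = [int(col.replace("Fp", "")) for col in df.columns]

# Hypothetical selection of row labels to draw
rows_to_plot = [0, 1, 2]

fig, ax = plt.subplots()
for row in rows_to_plot:
    # marker="o" draws the points and the connecting line in a single call
    ax.plot(freqs, df.loc[row].to_numpy(), marker="o", label=f"row {row}")

ax.set_xlabel("Frequency (MHz)")
ax.set_ylabel("Flux (Jy/beam)")
ax.grid(which="both", axis="both")
ax.legend()
```

Passing `marker="o"` to `ax.plot` replaces the separate `ax.scatter` and `ax.plot` calls per row, so each row also gets a single legend entry.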

edited Jul 17, 2022 at 0:35

answered Jul 16, 2022 at 23:46

Chinny84



6 Comments

Trenton McKinney Over a year ago

The OP states that the data already starts in a dataframe. This is not the correct way to begin when the data is already in a dataframe.

Chinny84 Over a year ago

I know, but sadly the data is not in a pandas dataframe in the question - the first part of the code above is to take what he has and put it in a dataframe. I can be explicit and tell him to change the columns in his dataframe as mine, but apart from that it is consistent.

Trenton McKinney Over a year ago

The first sentence of the OP is "I have a Pandas DataFrame of measurements:"

Chinny84 Over a year ago

yes - but I don't have the dataframe, I have to create it from the values he has provided. Did not think I had to explain that - but I have edited to make it clearer.

Trenton McKinney Over a year ago

The answer should not include constructing the dataframe, because that is irrelevant to the question. The OP already starts with a dataframe. The answer should show what to do with the dataframe to create the plot. There is no reason to use seaborn, because the dataframe can be plotted directly, as has already been demonstrated in a comment to the question.
