I have an example dataframe like this:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame({'a':[0.05, 0.11, 0.18, 0.20, 0.22, 0.27],
                  'b':[3.14, 1.56, 33.10, 430.00, 239.10, 2600.22]})

I would like to plot these properties as a scatter plot and then show the linear trend line of the samples. I also need the y-axis data (df['b']) on a log scale.

However, when I try to do that with np.polyfit, I get a strange line.

# Coefficients for polynomial function (degree 1) 
coefs = np.polyfit(df['a'], df['b'], 1)
fit_coefs = np.poly1d(coefs)

plt.figure()
plt.scatter(df['a'], df['b'], s = 50, edgecolors = 'black') 
plt.plot(df['a'], fit_coefs(df['a']), color='red',linestyle='--')
plt.xlabel('a') 
plt.ylabel('b')
plt.yscale('log')

[Figure: scatter plot of b against a with a log-scaled y-axis and the fitted line]

If I convert df['b'] to log10 before plotting, I get the right linear trend, but I would like the y-axis to show the original values, as in the previous plot, rather than the converted log values shown below:

df['b_log'] = np.log10(df['b'])

coefs = np.polyfit(df['a'], df['b_log'], 1)
fit_coefs = np.poly1d(coefs)

plt.figure()
plt.scatter(df['a'], df['b_log'], s = 50, edgecolors = 'black') 
plt.plot(df['a'], fit_coefs(df['a']), color='red', linestyle='--') 
plt.xlabel('a') 
plt.ylabel('b_log')

[Figure: scatter plot of b_log against a with the fitted straight line]

So basically, I need a plot like the last one, but with the y-axis showing the original values on a log scale, as in the previous plot, while still getting the correct linear trend. Could anyone help me?

1 Answer

You are doing two different things here. First, you fit a straight line to the raw, roughly exponential data, which is presumably not what you want. Then you fit a straight line to the log-transformed data, which is fine.
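As a rough illustration of why the first fit looks strange on a log axis, the sketch below repeats the degree-1 fit on the raw values and inspects the predictions. The array names a, b, linear_fit and raw_prediction are only illustrative and are not part of the original code.

```python
import numpy as np

# Same values as the question's dataframe, as plain arrays
a = np.array([0.05, 0.11, 0.18, 0.20, 0.22, 0.27])
b = np.array([3.14, 1.56, 33.10, 430.00, 239.10, 2600.22])

# Degree-1 fit on the raw (untransformed) values, as in the first attempt
linear_fit = np.poly1d(np.polyfit(a, b, 1))
raw_prediction = linear_fit(a)

# A log axis cannot display values <= 0, so such points vanish from the plot
print(raw_prediction)
print("non-positive predictions:", int((raw_prediction <= 0).sum()))
```

A straight line in linear coordinates is also not straight once the y-axis is log-scaled, which adds to the odd shape.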

To draw the line that was fitted in log space on the original scale, undo the log10 transform by evaluating 10**fit_coefs(df['a']):

df['b_log'] = np.log10(df['b'])

coefs = np.polyfit(df['a'], df['b_log'], 1)
fit_coefs = np.poly1d(coefs)

plt.figure()
plt.scatter(df['a'], df['b'], s = 50, edgecolors = 'black') 
plt.plot(df['a'], 10**fit_coefs(df['a']), color='red', linestyle='--') 
plt.xlabel('a')
plt.ylabel('b')
plt.yscale("log")
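
The same idea can be wrapped in a small helper. This is only a sketch: log_linear_trend and its num parameter are hypothetical names, and it assumes all y values are strictly positive. It fits in log10 space and returns the trend evaluated on a dense, sorted grid, so the line is drawn smoothly even if the x values are unsorted.

```python
import numpy as np

def log_linear_trend(x, y, num=200):
    """Hypothetical helper: fit log10(y) = slope * x + intercept and
    return the trend on the original y scale over a dense x grid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("y must be strictly positive to take log10")
    # Straight-line fit in log space
    slope, intercept = np.polyfit(x, np.log10(y), 1)
    # Evaluate on an evenly spaced grid and undo the log10 transform
    x_dense = np.linspace(x.min(), x.max(), num)
    y_trend = 10 ** (slope * x_dense + intercept)
    return x_dense, y_trend, slope, intercept

a = [0.05, 0.11, 0.18, 0.20, 0.22, 0.27]
b = [3.14, 1.56, 33.10, 430.00, 239.10, 2600.22]
x_dense, y_trend, slope, intercept = log_linear_trend(a, b)
print(slope, intercept)
```

The returned arrays can then be passed to plt.plot(x_dense, y_trend, color='red', linestyle='--') together with plt.yscale('log').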