
I have the following pandas DataFrame, which contains several arrays. I read it directly from a SQLite database using pd.read_sql():

      ArrayID  Value
 0        0      0
 1        0      1
 2        0      2
 3        0      3
 4        0      4
 5        0      5
 6        1      0
 7        1      1
 8        1      2
 9        1      3

I would like to know a fast way to get the arrays so I can plot them:

Array0 [0,1,2,3,4,5]

Array1 [0,1,2,3]

The only way I could think of is the following, which is really slow when the table has 1000 arrays of varying length, with a maximum length of 500:

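import pandas as pd
import matplotlib.pyplot as plt

# Loop over the ArrayID column (note: this visits every row, not every unique ID)
for id in df.ArrayID:
    # .values is an attribute, not a method
    array = df.loc[df["ArrayID"] == id, "Value"].values
    plt.plot(array)

plt.show()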

Or is matplotlib the bottleneck?

1 Answer


Use groupby to obtain all the groups in one call (instead of many calls to df.loc and df['ArrayID'] == id):

for aid, grp in df.groupby('ArrayID'):
    plt.plot(grp['Value'].values)
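If the arrays themselves are needed (as in the Array0 / Array1 listing in the question), the same groupby can collect them into a dictionary. This is a minimal sketch that rebuilds the question's sample frame by hand; the name `arrays` is only illustrative and not part of the original code:

```python
import pandas as pd

# Sample frame rebuilt from the question (in practice it would come from pd.read_sql).
df = pd.DataFrame({
    "ArrayID": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    "Value":   [0, 1, 2, 3, 4, 5, 0, 1, 2, 3],
})

# One pass over the frame: map each ArrayID to its values as a NumPy array.
arrays = {aid: grp.to_numpy() for aid, grp in df.groupby("ArrayID")["Value"]}

print(arrays[0].tolist())  # [0, 1, 2, 3, 4, 5]
print(arrays[1].tolist())  # [0, 1, 2, 3]
```

Each array can then be passed to plt.plot or any other function without filtering the frame again.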

Note also that plt.plot is not very fast, so calling it 1000 times may feel slow. Moreover, a plot with 1000 lines may be hard to read. You may need to rethink the quantity you wish to visualize (perhaps through clustering or aggregation).
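As one possible form of aggregation, the arrays could be summarized per position (mean, minimum and maximum across all arrays at each index) and drawn as a single line with a band. This is only a sketch on a small made-up frame; the names `pos` and `summary` and the choice of statistics are assumptions, not something the question asked for:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Small made-up frame with the same layout as in the question.
df = pd.DataFrame({
    "ArrayID": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    "Value":   [0, 1, 2, 3, 4, 5, 2, 3, 4, 5],
})

# Position of each row inside its own array (0, 1, 2, ... restarting per ArrayID).
pos = df.groupby("ArrayID").cumcount()

# Aggregate across arrays at each position.
summary = df.groupby(pos)["Value"].agg(["mean", "min", "max"])

# One line for the mean plus a shaded min/max band, instead of one line per array.
fig, ax = plt.subplots()
ax.plot(summary.index, summary["mean"])
ax.fill_between(summary.index, summary["min"], summary["max"], alpha=0.3)
```

Call plt.show() afterwards to display the figure. Positions that only the longer arrays reach are aggregated over fewer arrays, which may or may not be acceptable for the data at hand.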


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Build M random-walk arrays, each with a random length below N
N, M = 500, 1000
data = np.vstack([np.column_stack(np.broadcast_arrays(i,
    (np.random.random(np.random.randint(N))-0.5).cumsum())) for i in range(M)])
df = pd.DataFrame(data, columns=['ArrayID', 'Value'])
# One plt.plot call per group
for aid, grp in df.groupby('ArrayID'):
    plt.plot(grp['Value'].values)
plt.show()
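If the per-call overhead of plt.plot turns out to matter, one option that might be worth timing is to hand all lines to matplotlib at once through a LineCollection. This is a sketch on a small made-up frame, not a measured improvement; the name `segments` is illustrative:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

# Small made-up frame with the same layout as in the question.
df = pd.DataFrame({
    "ArrayID": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    "Value":   [0, 1, 2, 3, 4, 5, 0, 1, 2, 3],
})

# One (n_points, 2) array of (x, y) vertices per ArrayID; x restarts at 0 for each array.
segments = [
    np.column_stack([np.arange(len(grp)), grp.to_numpy()])
    for _, grp in df.groupby("ArrayID")["Value"]
]

fig, ax = plt.subplots()
# A single artist holds every line, so there is one add_collection call in total.
ax.add_collection(LineCollection(segments, linewidths=0.5))
# Rescale the view so the axis limits cover the data in the collection.
ax.autoscale()
```

Call plt.show() afterwards to display the figure. By default all lines in the collection share one color, so it would need an explicit colors argument to mimic the color cycling of repeated plt.plot calls.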



2 Comments

Thank you, it worked well for my task! I just changed plt.plot(grp["Value"]) to plt.plot(grp["Value"].values) so that the x-axis always starts from 0 for each array. Would you recommend another solution for faster plotting in Python? I know the number of lines is too high for one plot, but I wonder whether it is possible to plot faster in Python.
Sorry, I don't have a better suggestion, though running the code above didn't seem too terribly slow.
