Multiplying Dataframe rows with numpy array

Question

I have a DataFrame that looks like this:

         Date   Last  portfolioID FinancialInstrument
1   2018-03-28  64.67            1                 Oil
2   2018-03-29  64.91            1                 Oil
3   2018-04-02  62.85            1                 Oil
4   2018-04-03  63.57            1                 Oil
5   2018-04-04  63.56            1                 Oil
6   2018-04-05  63.73            1                 Oil
7   2018-04-06  61.93            1                 Oil
8   2018-03-23  65.74            3                 Oil
9   2018-03-26  65.49            3                 Oil
10  2018-03-27  64.67            3                 Oil
11  2018-03-28  64.67            3                 Oil
12  2018-03-29  64.91            3                 Oil
13  2018-04-02  62.85            3                 Oil
14  2018-04-03  63.57            3                 Oil
15  2018-04-04  63.56            3                 Oil
16  2018-04-05  63.73            3                 Oil
17  2018-04-06  61.93            3                 Oil
18  2018-04-02  62.85            5                 Oil
19  2018-04-03  63.57            5                 Oil
20  2018-04-04  63.56            5                 Oil
21  2018-04-05  63.73            5                 Oil
22  2018-04-06  61.93            5                 Oil

and a NumPy array that looks like this:

[ 152.69506795   76.05719501  127.28719173]

I am grouping the DataFrame by portfolioID. The first group corresponds to the first value in the NumPy array, the second group to the second value, and so on. Is there a way to multiply the Last column in the DataFrame by the array value for each row's group?

This is what I have, but I get an error stating "Length must be equal." Here, shares is the NumPy array:

for pid, group in data.groupby('portfolioID'):
    lastCol = group.Last
    clumN = lastCol.multiply(shares, axis=0)

miradulo · Accepted Answer · 2018-04-09 03:38:37Z

4

You can use pandas.Series.factorize to get the indices into your value array, and use these indices to get an appropriate array to multiply by.

import numpy as np
val_arr = np.array([152.69506795, 76.05719501, 127.28719173])

# factorize()[0] is a 0-based group code per row; use it to index val_arr
df.Last * val_arr[df.portfolioID.factorize()[0]]

# 1     9874.790044
# 2     9911.436861
# 3     9596.885021
# 4     9706.825470
# 5     9705.298519
# 6     9731.256680
# 7     9456.405558
# 8     5000.000000
# 9     4980.985701
# 10    4918.618801
# 11    4918.618801
# 12    4936.872528
# 13    4780.194706
# 14    4834.955887
# 15    4834.195315
# 16    4847.125038
# 17    4710.222087
# 18    8000.000000
# 19    8091.646778
# 20    8090.373906
# 21    8112.012729
# 22    7882.895784
# Name: Last, dtype: float64
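
To make the indexing step more concrete, here is a minimal sketch on a cut-down stand-in for the question's data (a handful of rows, not the full frame). The `Value` column name is just an illustrative choice. It shows what `factorize` returns and how those codes pick one multiplier per row.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's data (a few rows per portfolio).
df = pd.DataFrame({
    "Last": [64.67, 64.91, 65.74, 65.49, 62.85],
    "portfolioID": [1, 1, 3, 3, 5],
})
val_arr = np.array([152.69506795, 76.05719501, 127.28719173])

# factorize returns (codes, uniques); codes number each ID by first appearance.
codes, uniques = df["portfolioID"].factorize()
# codes   -> [0, 0, 1, 1, 2]
# uniques -> [1, 3, 5]

# Integer-array indexing expands val_arr to one multiplier per row.
df["Value"] = df["Last"] * val_arr[codes]
```

Note that the codes follow order of first appearance, so this matches the array only if the IDs first appear in the same order as the array's values. If the array is instead ordered by sorted ID, `factorize(sort=True)` should give codes in sorted order.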


Tai · Accepted Answer · 2018-04-09 03:55:45Z

1

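Count the occurrences of each group in df with count, then expand the second array, arr, to one value per row with np.repeat. This assumes the rows are already sorted by portfolioID so that each group is contiguous, as in the question's data.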

arr = np.array([152.69506795, 76.05719501, 127.28719173])
df.Last * np.repeat(arr, df.groupby("portfolioID")["Last"].count())
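
If the rows are not guaranteed to be grouped together, a possible variation (not part of the answer above, just a sketch) is to look up each row's group number with `groupby(...).ngroup()` instead of repeating values positionally. The data below is hypothetical and deliberately shuffled. It assumes the array is ordered by sorted portfolioID.

```python
import numpy as np
import pandas as pd

# Hypothetical data with rows deliberately NOT grouped together.
df = pd.DataFrame({
    "Last": [62.85, 64.67, 65.74, 64.91],
    "portfolioID": [5, 1, 3, 1],
})
# Assumed to be ordered by sorted portfolioID: values for IDs 1, 3, 5.
arr = np.array([152.69506795, 76.05719501, 127.28719173])

# ngroup() labels each row with its group number, in sorted key order.
group_idx = df.groupby("portfolioID").ngroup()

# Index arr by group number so each row gets its own group's multiplier.
result = df["Last"] * arr[group_idx.to_numpy()]
```

Because the lookup is per row rather than positional, the result does not depend on the order of the rows.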
