4

I have a DataFrame that looks like this:

         Date   Last  portfolioID FinancialInstrument
1   2018-03-28  64.67            1                 Oil
2   2018-03-29  64.91            1                 Oil
3   2018-04-02  62.85            1                 Oil
4   2018-04-03  63.57            1                 Oil
5   2018-04-04  63.56            1                 Oil
6   2018-04-05  63.73            1                 Oil
7   2018-04-06  61.93            1                 Oil
8   2018-03-23  65.74            3                 Oil
9   2018-03-26  65.49            3                 Oil
10  2018-03-27  64.67            3                 Oil
11  2018-03-28  64.67            3                 Oil
12  2018-03-29  64.91            3                 Oil
13  2018-04-02  62.85            3                 Oil
14  2018-04-03  63.57            3                 Oil
15  2018-04-04  63.56            3                 Oil
16  2018-04-05  63.73            3                 Oil
17  2018-04-06  61.93            3                 Oil
18  2018-04-02  62.85            5                 Oil
19  2018-04-03  63.57            5                 Oil
20  2018-04-04  63.56            5                 Oil
21  2018-04-05  63.73            5                 Oil
22  2018-04-06  61.93            5                 Oil

and a NumPy array that looks like this:

[ 152.69506795   76.05719501  127.28719173]

I am grouping the DataFrame using the portfolioID where the first grouping correlates with the first value in the NumPy array and second group with second value in the NumPy array, etc. My question is, is there a way I can multiply the Last column in the DataFrame with its corresponding NumPy array value?

This is what I have but I get an error stating "Length must be equal." shares is the NumPy array:

for pid, group in data.groupby('portfolioID'):
    lastCol = group.Last
    clumN = lastCol.multiply(shares, axis=0)
0

2 Answers 2

4

You can use pandas.Series.factorize to get the indices into your value array, and use these indices to get an appropriate array to multiply by.

val_arr = np.array([152.69506795, 76.05719501, 127.28719173])

df.Last * val_arr[df.portfolioID.factorize()[0]]

# 1     9874.790044
# 2     9911.436861
# 3     9596.885021
# 4     9706.825470
# 5     9705.298519
# 6     9731.256680
# 7     9456.405558
# 8     5000.000000
# 9     4980.985701
# 10    4918.618801
# 11    4918.618801
# 12    4936.872528
# 13    4780.194706
# 14    4834.955887
# 15    4834.195315
# 16    4847.125038
# 17    4710.222087
# 18    8000.000000
# 19    8091.646778
# 20    8090.373906
# 21    8112.012729
# 22    7882.895784
# Name: Last, dtype: float64
Sign up to request clarification or add additional context in comments.

Comments

1

Count the occurrance of each group in the df with count and resize the second array, arr, with np.repeat.

arr = np.array([152.69506795, 76.05719501, 127.28719173])
df.Last * np.repeat(arr, df.groupby("portfolioID")["Last"].count())

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.