I have a NumPy array as follows:

     array([[1, 2],
            [3, 4],
            [5, 6],
            [7, 8]])

The array is called myArray. I perform two slicing operations on this 2D array and get the following results:

     In[1]: a2 = myArray[1:]
            a2

     Out[1]:array([[3, 4],
                   [5, 6],
                   [7, 8]])


     In[2]: a1 = myArray[:-1]
            a1

     Out[2]:array([[1, 2],
                   [3, 4],
                   [5, 6]])

Now I apply the following NumPy expression to the two slices and get these results:

     In[]: theta = np.arccos((a1*a2).sum(axis= 1)/(np.sqrt((a1**2).sum(axis= 1)*(a2**2).sum(axis= 1))))
           theta
     Out[]: array([ 0.1798535 ,  0.05123717,  0.02409172])

I perform the same sequence of operations on an equivalent data frame:

    In[]: df = pd.DataFrame(data = myArray, columns = ["x", "y"])
          df
    Out[]:
         x    y
      0  1    2
      1  3    4
      2  5    6
      3  7    8

    In[]: b2 = df[["x", "y"]].iloc[1:]
          b2
    Out[]:
            x   y
       1    3   4
       2    5   6
       3    7   8

    In[]: b1 = df[["x", "y"]].iloc[:-1]
          b1
    Out[]:
            x   y
       0    1   2
       1    3   4
       2    5   6

But when I try to compute theta for the data frame, I get only 0s and NaN values:

      In[]: theta2 = np.arccos((b1*b2).sum(axis= 1)/(np.sqrt((b1**2).sum(axis= 1)*(b2**2).sum(axis= 1))))
            theta2
      Out[]: 
            0    NaN
            1    0.0
            2    0.0
            3    NaN
            dtype: float64

Am I applying the NumPy functions to the sliced data frames correctly? How can I get the same result for theta when working with the data frame?

UPDATE

As suggested below, using b1.values and b2.values works. But when I wrap the calculation in a function and apply it to df, I keep getting a ValueError:

       def theta(group):
           b2 = df[["x", "y"]].iloc[1:]
           b1 = df[["x", "y"]].iloc[:-1]

           t = np.arccos((b1.values*b2.values).sum(axis= 1)/
                         (np.sqrt((b1.values**2).sum(axis= 1)*(b2.values**2).sum(axis= 1))))

           return t

       df2 = df.apply(theta)

This raises a ValueError:

       ValueError: Shape of passed values is (2, 3), indices imply (2, 4)

Please let me know where I am going wrong.

Thanks in advance.

  • @piRSquared Can you please help me with the UPDATE part here? Commented May 8, 2017 at 14:26

1 Answer

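The indexes of b1 and b2 are not aligned. b1 has index labels 0 to 2 and b2 has labels 1 to 3, and pandas matches rows by label rather than by position. Labels 0 and 3 exist in only one of the two frames, which produces NaN, while labels 1 and 2 pair each row with itself, which produces an angle of 0.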
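To make the alignment behaviour visible, here is a minimal sketch that rebuilds the frames from the question and multiplies them. It assumes only the data shown in the question; the variable name `product` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], columns=["x", "y"])
b1 = df[["x", "y"]].iloc[:-1]   # index labels 0, 1, 2
b2 = df[["x", "y"]].iloc[1:]    # index labels 1, 2, 3

# Arithmetic between two DataFrames matches rows by index label, not position.
product = b1 * b2
print(product)

# Labels 0 and 3 exist in only one operand, so those rows come out as NaN.
# Labels 1 and 2 multiply each row by itself, so the cosine is 1 and the angle 0.
```

The result has four rows (labels 0 to 3) instead of three, which is why theta2 in the question has four entries.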

If you give b2 the same index as b1:

    b2.index = b1.index
    np.arccos((b1*b2).sum(axis= 1)/(np.sqrt((b1**2).sum(axis= 1)*(b2**2).sum(axis= 1))))

the output should be:

    Out[75]:
    0    0.179853
    1    0.051237
    2    0.024092
    dtype: float64

If you don't want to change the index, you can use .values on b1 and b2 explicitly, so the arithmetic runs on plain NumPy arrays and no label alignment takes place:

    np.arccos((b1.values*b2.values).sum(axis= 1)/(np.sqrt((b1.values**2).sum(axis= 1)*(b2.values**2).sum(axis= 1))))
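If the calculation needs to be reused, one option is to wrap the array-based version in a small helper. This is only a sketch: the function name `consecutive_angles` and its parameters `frame` and `cols` are hypothetical, `to_numpy()` is used in place of `.values`, and the `np.clip` call is an added safeguard that is not part of the original expression.

```python
import numpy as np
import pandas as pd

def consecutive_angles(frame, cols=("x", "y")):
    """Angle in radians between each row vector and the next one."""
    v = frame[list(cols)].to_numpy(dtype=float)
    a1, a2 = v[:-1], v[1:]          # positional slices, so no index alignment
    dot = (a1 * a2).sum(axis=1)
    norms = np.sqrt((a1 ** 2).sum(axis=1) * (a2 ** 2).sum(axis=1))
    # Clip guards against rounding pushing the ratio just outside [-1, 1].
    return np.arccos(np.clip(dot / norms, -1.0, 1.0))

df = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], columns=["x", "y"])
theta = consecutive_angles(df)
print(theta)
```

For the data in the question this should reproduce the three values obtained from the plain NumPy array.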

4 Comments

Thanks a lot, this is what I was expecting.
@Liza, can you show what's your expected output with your update?
I am sorry for the late reply. I expect the same answer you helped me get previously, i.e. array([ 0.1798535 , 0.05123717, 0.02409172]). I am applying the same operations, but I have created a function theta() and put them inside it.
df.apply applies a function row-wise or column-wise to a DataFrame. You can simply call theta('') directly, which will give you the same output. By the way, the group parameter is not required.
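As a rough illustration of the point about df.apply, the sketch below shows what the function receives under apply and what a direct call looks like. It assumes a version of theta that takes the frame as its argument (the parameter name `frame` is hypothetical) rather than reading the global df as in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], columns=["x", "y"])

def theta(frame):
    # Work on plain arrays so rows are paired by position.
    b1 = frame[["x", "y"]].to_numpy()[:-1]
    b2 = frame[["x", "y"]].to_numpy()[1:]
    return np.arccos((b1 * b2).sum(axis=1)
                     / np.sqrt((b1 ** 2).sum(axis=1) * (b2 ** 2).sum(axis=1)))

# With the default axis=0, df.apply calls its function once per column and
# passes that column as a Series, not the whole DataFrame.
kinds = df.apply(lambda col: type(col).__name__)
print(kinds)

# Calling the function once on the whole frame gives the three angles.
angles = theta(df)
print(angles)
```

In the question, theta ignores its argument and returns three values for each of the two columns, while df has four rows, which would explain the shape mismatch in the error message.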
