I'm trying to use df.eval to evaluate an expression that contains a function call. In my example it is numpy.around, which I have imported into the local namespace. According to the documentation, prefixing the function name with @ should do the trick, but it throws this error:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

What am I doing wrong? It fails on a fresh run, both in an IDE (Spyder / Jupyter notebook) and from the console.

import numpy as np
import pandas as pd
from numpy import around


df = pd.DataFrame({'x':np.array([1.12,2.76])})

# this throws TypeError: 'Series' objects are mutable, thus they cannot be hashed
df['y'] = df.eval('@around(x,1)')

# this works
df['z'] = around(df['x'],1)

print(pd.__version__)
# 0.23.4

print(np.__version__)
# 1.15.1

import sys
print(sys.version)
# 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)]
  • If you re-run that exact code afresh... does it work? I've just tried your exact code and it works fine... Commented Sep 8, 2018 at 8:58
  • @JonClements I can reproduce it with numpy 1.14.5, pandas 0.20.3 Commented Sep 8, 2018 at 9:03
  • Works for me with pandas: 0.23.4 and numpy:1.15.1 although I wouldn't expect it not to have worked in earlier versions... Just wondering if there were floating variables/columns left over somehow from a previous run... Commented Sep 8, 2018 at 9:07
  • @JonClements pastebin.com/ZfyhqKPs and that's after a kernel restart so nothing can be left over Commented Sep 8, 2018 at 9:08
  • Bizarre... I get a DF of {'x': {0: 1.12, 1: 2.76}, 'y': {0: 1.1, 1: 2.8}} which looks correct... Commented Sep 8, 2018 at 9:10

1 Answer

Update: The workaround is simpler than the one described below. There is no need to remove the numexpr package; just tell eval to use the Python evaluation engine instead of numexpr:

df['y'] = df.eval('@around(x,1)', engine='python')
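A minimal, self-contained sketch of this workaround, reusing the DataFrame from the question. It assumes pandas and numpy are installed; the column names y and z are simply the ones used in the question, and the final comparison is only a sanity check against the direct call.

```python
import numpy as np
import pandas as pd
from numpy import around

df = pd.DataFrame({'x': np.array([1.12, 2.76])})

# Force the pure-Python evaluation engine instead of the default
# (numexpr, when that package is installed).
df['y'] = df.eval('@around(x, 1)', engine='python')

# Reference result computed without eval, for comparison.
df['z'] = around(df['x'], 1)

# Check that both columns hold the same rounded values.
print(np.allclose(df['y'], df['z']))
```

Passing engine='python' explicitly should be harmless when numexpr is absent, so it can be left in place regardless of which packages are installed.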

First answer

I managed to find a workaround, and I also figured out where the problem seems to be. First I updated conda itself and all packages (including Python to 3.7.0, although I don't believe the Python version is relevant). After that:

Step 1: remove pandas and numpy

conda remove pandas numpy

The following packages will be REMOVED:

    bokeh:        0.13.0-py37_0
    mkl_fft:      1.0.4-py37h1e22a9b_1
    mkl_random:   1.0.1-py37h77b88f5_1
    numba:        0.39.0-py37h830ac7b_0
    numexpr:      2.6.8-py37h9ef55f4_0
    numpy:        1.15.1-py37ha559c80_0
    pandas:       0.23.4-py37h830ac7b_0
    scikit-learn: 0.19.1-py37hae9bb9f_0
    scipy:        1.1.0-py37h4f6bf74_1

Step 2: reinstall only pandas and numpy

conda install pandas numpy

The following NEW packages will be INSTALLED:

    mkl_fft:    1.0.4-py37h1e22a9b_1
    mkl_random: 1.0.1-py37h77b88f5_1
    numpy:      1.15.1-py37ha559c80_0
    pandas:     0.23.4-py37h830ac7b_0

After step 2, the code worked as expected, so the problem must lie in one of the other packages removed.

Step 3: add back each of the packages removed initially, one by one (bokeh, numba, numexpr, scikit-learn, scipy), and test each time whether the code still works. After installing numexpr the code failed, so that is where the problem is. I'm not sure how numexpr could be added back without breaking the code again: I tried some older versions, but the code failed every time.
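Since the failure tracks whether numexpr is present in the environment, a quick check can save the reinstall cycle. The sketch below uses only the standard library; the helper name is_installed is hypothetical and not part of pandas.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return find_spec(module_name) is not None


# True means the package is available in this environment, in which case
# the default engine is the one the steps above identified as failing.
numexpr_present = is_installed('numexpr')
print('numexpr installed:', numexpr_present)
```

If this reports that numexpr is installed, the steps above suggest the question's expression will fail with the default engine in that environment.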

1 Comment

Good bit of detective work. It's curious, as I made an (obviously foolish) assumption that .eval required numexpr because it's required for .query... Checking pd.show_versions() I see I didn't have it installed, which is presumably why your code worked in the virtualenv I was using. This kind of feels like it should be an issue on the pandas GH...
