I am using pandas, which very efficiently sorts and filters the data the way I need.

This code worked fine, until I changed the last column to a complex number; now I get an error.

    return self._cython_agg_general('mean')
    raise DataError('No numeric types to aggregate')
pandas.core.groupby.DataError: No numeric types to aggregate

The error refers to my eighth column (the one with the complex numbers), since that is the column I want the mean of. I cannot find a way to convert the object column to complex numbers (from what I understand, pandas now supports complex numbers).

This is the code I use.

import numpy as np
import pandas as pd
df = pd.read_csv('final.dat', sep=",", header=None)
df.columns=['X.1', 'X.2', 'X.3', 'X.4','X.5', 'X.6', 'X.7', 'X.8']
df1 = df.groupby(["X.1","X.2","X.5"])["X.8"].mean().reset_index()

After that I get the error described above.

When I read my file, this is the df output.

<class 'pandas.core.frame.DataFrame'>
Int64Index: 21266 entries, 0 to 21265
Data columns (total 8 columns):
X.1    21266  non-null values
X.2    21266  non-null values
X.3    21266  non-null values
X.4    21266  non-null values
X.5    21266  non-null values
X.6    21266  non-null values
X.7    21266  non-null values
X.8    21266  non-null values
dtypes: float64(4), int64(3), object(1)

This is a small sample of the input file.

4 Answers

6

The parser doesn't support reading complex values directly, so apply the following transform.

In [37]: df['X.8'] = df['X.8'].str.replace('i','j').apply(complex)

In [38]: df
Out[38]: 
          X.1         X.2  X.3   X.4    X.5  X.6  X.7                X.8
0   564991.15  7371277.89    0     1   1530  0.1    2   (92.289+151.96j)
1   564991.15  7371277.89    0     1   8250  0.1    2   (104.22-43.299j)
2   564991.15  7371277.89    0     1  20370  0.1    2    (78.76-113.52j)
3   564991.15  7371277.89    0     1  33030  0.1    2    (27.141-154.1j)
4   564991.15  7371277.89    0     1  47970  0.1    2     (-30.012-175j)
5   564991.15  7371277.89    0     1  63090  0.1    2  (-118.52-342.43j)
6   564991.15  7371277.89    0     1  93090  0.1    2  (-321.02-1541.5j)
7   564991.15  7371277.89    0     2   1530  0.1    2   (118.73+154.05j)
8   564991.15  7371277.89    0     2   8250  0.1    2   (122.13-45.571j)
9   564991.15  7371277.89    0     2  20370  0.1    2   (93.014-116.03j)
10  564991.15  7371277.89    0     2  33030  0.1    2    (38.56-155.08j)
11  564991.15  7371277.89    0     2  47970  0.1    2  (-20.653-173.83j)
12  564991.15  7371277.89    0     2  63090  0.1    2  (-118.41-340.58j)
13  564991.15  7371277.89    0     2  93090  0.1    2    (-378.71-1554j)
14  564990.35  7371279.17    0  1785   1530  0.1    2   (-15.441+118.3j)
15  564990.35  7371279.17    0  1785   8250  0.1    2  (-7.1735-76.487j)
16  564990.35  7371279.17    0  1785  20370  0.1    2  (-33.847-145.99j)
17  564990.35  7371279.17    0  1785  33030  0.1    2  (-86.035-185.46j)
18  564990.35  7371279.17    0  1785  47970  0.1    2  (-143.37-205.23j)
19  564990.35  7371279.17    0  1785  63090  0.1    2  (-234.67-370.43j)
20  564990.35  7371279.17    0  1785  93090  0.1    2  (-458.69-1561.4j)
21  564990.36  7371279.17    0  1786   1530  0.1    2    (36.129+128.4j)
22  564990.36  7371279.17    0  1786   8250  0.1    2   (39.406-69.607j)
23  564990.36  7371279.17    0  1786  20370  0.1    2   (10.495-139.48j)
24  564990.36  7371279.17    0  1786  33030  0.1    2  (-43.535-178.19j)
25  564990.36  7371279.17    0  1786  47970  0.1    2  (-102.28-196.76j)
26  564990.36  7371279.17    0  1786  63090  0.1    2   (-199.32-362.1j)
27  564990.36  7371279.17    0  1786  93090  0.1    2  (-458.09-1565.6j)

In [39]: df.dtypes
Out[39]: 
X.1       float64
X.2       float64
X.3       float64
X.4         int64
X.5         int64
X.6       float64
X.7         int64
X.8    complex128
dtype: object

In [40]: df1 = df.groupby(["X.1","X.2","X.5"])["X.8"].mean().reset_index()

In [41]:  df.groupby(["X.1","X.2","X.5"])["X.8"].mean().reset_index()
Out[41]: 
          X.1         X.2    X.5                  X.8
0   564990.35  7371279.17   1530     (-15.441+118.3j)
1   564990.35  7371279.17   8250    (-7.1735-76.487j)
2   564990.35  7371279.17  20370    (-33.847-145.99j)
3   564990.35  7371279.17  33030    (-86.035-185.46j)
4   564990.35  7371279.17  47970    (-143.37-205.23j)
5   564990.35  7371279.17  63090    (-234.67-370.43j)
6   564990.35  7371279.17  93090    (-458.69-1561.4j)
7   564990.36  7371279.17   1530      (36.129+128.4j)
8   564990.36  7371279.17   8250     (39.406-69.607j)
9   564990.36  7371279.17  20370     (10.495-139.48j)
10  564990.36  7371279.17  33030    (-43.535-178.19j)
11  564990.36  7371279.17  47970    (-102.28-196.76j)
12  564990.36  7371279.17  63090     (-199.32-362.1j)
13  564990.36  7371279.17  93090    (-458.09-1565.6j)
14  564991.15  7371277.89   1530  (105.5095+153.005j)
15  564991.15  7371277.89   8250    (113.175-44.435j)
16  564991.15  7371277.89  20370    (85.887-114.775j)
17  564991.15  7371277.89  33030    (32.8505-154.59j)
18  564991.15  7371277.89  47970  (-25.3325-174.415j)
19  564991.15  7371277.89  63090  (-118.465-341.505j)
20  564991.15  7371277.89  93090  (-349.865-1547.75j)
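The same transform can be tried end to end on a small in-memory file. This is only a sketch under assumptions: the three-column data and the names `k1`, `k2`, `z` and `means` are made up for illustration and are not the asker's file. It uses the built-in `complex` for parsing and averages each group with NumPy on the raw values, which keeps the imaginary part without relying on how a particular pandas version treats complex columns in its own `mean`.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical three-column stand-in for final.dat: two keys and a complex value.
raw = io.StringIO(
    "1.0,10,1+2i\n"
    "1.0,10,3-4i\n"
    "2.0,20,5+6i\n"
)
df = pd.read_csv(raw, header=None, names=["k1", "k2", "z"])

# Python's complex() expects 'j' as the imaginary unit, so swap it in first.
df["z"] = (
    df["z"].str.replace("i", "j", regex=False)
    .apply(complex)
    .astype(np.complex128)
)

# Average each group with NumPy on the underlying array of complex values.
means = (
    df.groupby(["k1", "k2"])["z"]
    .apply(lambda s: s.to_numpy().mean())
    .reset_index()
)
print(means)
```

If the installed pandas already averages complex columns correctly, the `.apply(...)` step can be swapped for `.mean()` as in the transcript above.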

12 Comments

thanks Jeff, this seems to work - I get an error message (that the imaginary part is discarded) but the values appear right!
really, that is weird. also numpy uses j for complex part, FYI
maybe my versions are different? running exactly the same as above here is the error message: /usr/local/lib/python2.7/dist-packages/pandas-0.12.0-py2.7-linux-x86_64.egg/pandas/core/nanops.py:549: ComplexWarning: Casting complex values to real discards the imaginary part x = float(x)
numpy version? (I am using 1.7.1)
same here - numpy 1.7.1 (I also just updated to pandas 0.12, the latest version, but same thing)
4

Or you can parse it directly as a complex number by passing in a converter for that column when you read in the data:

pd.read_csv('final.dat', header=None,
            names=['X.1', 'X.2', 'X.3', 'X.4','X.5', 'X.6', 'X.7', 'X.8'],
            converters={'X.8': lambda s: complex(s.replace('i', 'j'))})
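A self-contained version of the converter idea, as a rough sketch: the two-column data, the column names `x` and `z`, and the helper `parse_complex` are assumptions for illustration. The final `astype` is a precaution in case the converted column comes back with object dtype.

```python
import io

import numpy as np
import pandas as pd


def parse_complex(text):
    """Turn a string such as '1+2i' into a Python complex number."""
    return complex(text.strip().replace("i", "j"))


# Hypothetical two-column file: one float and one complex value per row.
raw = io.StringIO("1.0,1+2i\n2.0,3-4i\n")

df = pd.read_csv(
    raw,
    header=None,
    names=["x", "z"],
    converters={"z": parse_complex},  # applied to each raw string in column z
)

# Make the dtype explicit rather than relying on type inference.
df["z"] = df["z"].astype(np.complex128)
print(df.dtypes)
```

Doing the conversion at read time means the column never exists as strings in the frame, so no separate clean-up pass is needed afterwards.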

2 Comments

thanks, that looks good too, but I still get the same error as above (see my comment to Jeff)
nice approach for the converters. I had seen reference to direct support for complex numbers somewhere else but it did not seem to work
2

For converting all columns, try df.applymap(lambda s: complex(s.replace('i', 'j'))) (applymap is a DataFrame method; from pandas 2.1 on it is called DataFrame.map).

1 Comment

With the help of this answer, a minimal example can be found here.
0

I tried implementing the lambda but was getting an error:

ValueError: complex() arg is a malformed string

I found out I had to eliminate the spaces as well as change the 'i' character to 'j'. Here's my code:

# Convert every string (object) column to complex numbers.
for col in df.columns:
    if df[col].dtype == object:
        df[col] = df[col].str.replace('i', 'j')   # complex() expects 'j'
        df[col] = df[col].str.replace(' ', '')    # complex() rejects inner spaces
        df[col] = df[col].apply(complex)
print(df[df.columns[1]])
print(df.dtypes)
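The reason the spaces matter is that Python's `complex()` tolerates whitespace around the whole string but not between the real and imaginary parts. A small sketch of a cleaning helper follows; the name `to_complex` and the sample strings are illustrative assumptions.

```python
def to_complex(text):
    """Parse strings like '92.289 + 151.96i' into a complex number."""
    # Drop all spaces, then use 'j' for the imaginary unit as Python expects.
    cleaned = text.replace(" ", "").replace("i", "j")
    return complex(cleaned)


# Inner spaces are rejected by complex() itself.
try:
    complex("1 + 2j")
except ValueError as exc:
    print("rejected:", exc)

# The helper removes them before parsing.
print(to_complex("92.289 + 151.96i"))
```

A helper like this can be passed to `Series.apply` or used as a `read_csv` converter in place of the lambda.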
