Getting "'DataFrame' objects are mutable, thus they cannot be hashed" error while creating a nested array column in a DataFrame

Question

I am trying to build a nested array (a list of dicts per row) in a pandas DataFrame column, df['link'], and attach it back to the original DataFrame. What is wrong with my code, and how can I fix it?

Error:

TypeError

Traceback (most recent call last)

in
2 df2['shipmentNumber'] = df2.shipmentID.str.split('-',1).str[-1]
3 df2['link'] = pd.DataFrame({'link': df2.to_dict('records')})
----> 4 result['link'] = df2.groupby(df2.index).agg(list)['link']

c:\users\ashok.eapen\pycharmprojects\rs-components\venv\lib\site-packages\pandas\core\groupby\generic.py in aggregate(self, func, engine, engine_kwargs, *args, **kwargs)
940 def aggregate(self, func=None, *args, engine=None, engine_kwargs=None, **kwargs):
941
--> 942 relabeling, func, columns, order = reconstruct_func(func, **kwargs)
943
944 if maybe_use_numba(engine):

TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed

My input:

df6

ShipmentID                                                                             CustomerCode  
['USWPR04-20210429-S-00001', 'USWPR04-20210429-S-00002','USWPR04-20210429-S-00006']    USWPR04
['MSLPR04-20210429-S-00001', 'MSLPR04-20210429-S-00002']                               MSLPR04

My code:

df2 = df6.explode('shipmentID')
df2['shipmentNumber'] = df2.shipmentID.str.split('-', 1).str[-1]
df2['link'] = pd.DataFrame({'link': df2.to_dict('records')})
result['link'] = df2.groupby(df2.index).agg(list)['link']

Expected output column:

df['LinkID']

[{ "shipID": "USWPR04-20210429-S-00001", "customerCode": "USWPR04", "shipNumber": "20210429-S-00001" },
 { "shipID": "USWPR04-20210429-S-00002", "customerCode": "USWPR04", "shipNumber": "20210429-S-00002" },
 { "shipID": "USWPR04-20210429-S-00006", "customerCode": "USWPR04", "shipNumber": "20210429-S-00006" }]

[{ "shipID": "MSLPR04-20210429-S-00001", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00001" },
 { "shipID": "MSLPR04-20210429-S-00002", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00002" }]

Expected final dataframe:

ShipID                                                                             CustomerCode   link
['USWPR04-20210429-S-00001', 'USWPR04-20210429-S-00002','USWPR04-20210429-S-00006']    USWPR04    [{ "shipID": "USWPR04-20210429-S-00001", "customerCode": "USWPR04", "shipNumber": "20210429-S-00001" },{ "shipID": "USWPR04-20210429-S-00002", "customerCode": "USWPR04", "shipNumber": "20210429-S-00002" },{ "shipID": "USWPR04-20210429-S-00006", "customerCode": "USWPR04", "shipNumber": "20210429-S-00006" }]
['MSLPR04-20210429-S-00001', 'MSLPR04-20210429-S-00002']                               MSLPR04    [{ "shipID": "MSLPR04-20210429-S-00001", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00001" },{ "shipID": "MSLPR04-20210429-S-00002", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00002" }]

jezrael · Accepted Answer · 2021-07-08 07:19:30Z

Use:

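# pass index=df2.index so the records line up with df2's (duplicated) row labels
df2['link'] = pd.DataFrame({'link': df2.to_dict('records')}, index=df2.index)
# select the 'link' column before aggregating, then assign the result to `df6`
df6['link'] = df2.groupby(df2.index)['link'].agg(list)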
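
For anyone who wants to run this approach end to end, here is a minimal sketch built on the sample data from the question. It assumes the column is named `ShipmentID` (the question's code uses `shipmentID`, so adjust to your frame), and it wraps the records in a `pd.Series` with `index=df2.index` instead of a one-column DataFrame. That is a swapped-in variation, not the exact code above, but it serves the same purpose of keeping each record attached to its original row label.

```python
import pandas as pd

# Sample input reconstructed from the question (column names are an assumption).
df6 = pd.DataFrame({
    "ShipmentID": [
        ["USWPR04-20210429-S-00001", "USWPR04-20210429-S-00002", "USWPR04-20210429-S-00006"],
        ["MSLPR04-20210429-S-00001", "MSLPR04-20210429-S-00002"],
    ],
    "CustomerCode": ["USWPR04", "MSLPR04"],
})

# One row per shipment ID; the original row label is repeated in the index.
df2 = df6.explode("ShipmentID")
df2["shipmentNumber"] = df2["ShipmentID"].str.split("-", n=1).str[-1]

# One dict per exploded row, carrying df2's duplicated index so the
# groupby below can collect the dicts that belong to each original row.
records = pd.Series(df2.to_dict("records"), index=df2.index)

# Gather each original row's dicts into a list and attach it to df6.
df6["link"] = records.groupby(level=0).agg(list)
```

Note that the dict keys here come from the column names of `df2`, so rename the columns first if you need keys such as `shipID`.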

Or, instead of your solution, use a list comprehension:

df6['link1'] = [[{'shipID':x, 'CustomerCode':b, 'shipmentNumber': x.split('-',1)[-1]} 
                for x in a] for a, b in zip(df6['ShipmentID'],df6['CustomerCode'])]

The outputs are the same:

print (df6['link'] == df6['link1'])
0    True
1    True
dtype: bool
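
The list-comprehension variant can also be tried on its own. The sketch below rebuilds the sample input from the question (again assuming the column names `ShipmentID` and `CustomerCode`) and moves the inner comprehension into `build_links`, a hypothetical helper introduced only for readability.

```python
import pandas as pd


def build_links(shipment_ids, customer_code):
    """Return one dict per shipment ID (hypothetical helper, not part of the answer)."""
    return [
        {
            "shipID": sid,
            "CustomerCode": customer_code,
            # Everything after the first '-' is treated as the shipment number.
            "shipmentNumber": sid.split("-", 1)[-1],
        }
        for sid in shipment_ids
    ]


# Sample input reconstructed from the question (column names are an assumption).
df6 = pd.DataFrame({
    "ShipmentID": [
        ["USWPR04-20210429-S-00001", "USWPR04-20210429-S-00002", "USWPR04-20210429-S-00006"],
        ["MSLPR04-20210429-S-00001", "MSLPR04-20210429-S-00002"],
    ],
    "CustomerCode": ["USWPR04", "MSLPR04"],
})

# No explode/groupby round trip: build the nested list row by row.
df6["link1"] = [
    build_links(ids, code)
    for ids, code in zip(df6["ShipmentID"], df6["CustomerCode"])
]

print(df6["link1"].iloc[1])
```

Because this version never explodes the frame, there is no index to realign, which makes it the simpler option when the dict keys need to differ from the column names.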

edited Jul 8, 2021 at 7:19

answered Jul 8, 2021 at 7:03
