Pandas dataframe created from json has unnamed column - can't insert into MySQL due to unnamed column issue

Question

I am working with some JSON data and trying to push it into a MySQL database on the fly. The JSON file is enormous, so I go through it line by line using a generator (yield) in Python, convert each JSON line into a small pandas DataFrame, and write it to MySQL. The problem is that when I create the DataFrame from JSON, it adds an index column, and when I write to MySQL it seems to ignore the index=False option. Code below:

import gzip
import pandas as pd
from sqlalchemy import create_engine

# Generator that yields one parsed record per line of the file
def parseJSON(path):
    with open(path, 'r') as g:
        for l in g:
            yield eval(l)

# MySQL engine
engine = create_engine('mysql://login:password@localhost:1234/MyDB', echo=False)
# empty df just to have it
df = {}

for l in parseJSON("MyFile.json"):
    df = pd.DataFrame.from_dict(l, orient='index')
    df.to_sql(name='MyTable', con=engine, if_exists='append', index=False)

And I get an error:

OperationalError: (_mysql_exceptions.OperationalError) (1054, "Unknown column '0' in 'field list'")

Any ideas what I am missing? Or is there a way to get around this stuff?

Update: I see that the DataFrame has an unnamed column labeled 0 each time I create it in the inner loop.

Here is some info about DF:

df
Out[155]: 
                                                                0
reviewerID                                         A1C2VKKDCP5H97
asin                                                   0007327064
reviewerName                                        Donna Polston
helpful                                                    [0, 0]
unixReviewTime                                         1392768000
reviewText      love Oddie ,One of my favorite books are the O...
overall                                                         5
reviewTime                                            02 19, 2014
summary                                                       Wow

print(df.columns)
RangeIndex(start=0, stop=1, step=1)

Sounds like the column names differ between your dataframe and your table.
– Bob Haffner, Commented Apr 18, 2017 at 2:56
@BobHaffner, hi, I double-checked that; the columns are exactly the same. If a column did not exist, it would let me know, I believe. I updated the question a bit.
– Maksim Khaitovich, Commented Apr 18, 2017 at 2:58
OK, so they all match except you have an extra column with a value of 0? Can you do a print(df.columns) right before your df.to_sql()?
– Bob Haffner, Commented Apr 18, 2017 at 3:01
OK, now it's a little clearer. You currently have a frame with one column named 0, with your intended column names as the index of the frame. Perhaps you can try df = pd.DataFrame.from_dict(l), or you could try df.T.to_sql(name='MyTable', con=engine, if_exists='append', index=False), which transposes the frame before pushing it to MySQL. NOTE: I think you would get much better performance if you built up a dict (or some other structure), converted all rows to a DataFrame, and then pushed that to MySQL. Going one row at a time might be too slow.
– Bob Haffner, Commented Apr 18, 2017 at 3:20

Bob Haffner · Accepted Answer · 2017-04-19 01:25:59Z

2

You currently have a frame with one column named 0, with your intended column names as the index of the frame. Perhaps you can try

df = pd.DataFrame.from_dict(l)
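
To make the shape problem concrete, here is a minimal sketch using a made-up record (the `record` dict below is hypothetical and much smaller than a real line of the file). It shows why orient='index' produces a single column named 0, and two ways to get one row with the keys as columns: transposing, as suggested in the comments, and wrapping the dict in a list. The list-wrapping call is an alternative swapped in here, not the call from the answer, because from_dict(l) with the default orientation may not yield a single row depending on the values in the record.

```python
import pandas as pd

# Hypothetical record, shaped like a small slice of one line of the file.
record = {
    "reviewerID": "A1C2VKKDCP5H97",
    "asin": "0007327064",
    "overall": 5,
    "summary": "Wow",
}

# orient='index' turns each key into a row label, so all values land in
# a single column that pandas labels 0.
tall = pd.DataFrame.from_dict(record, orient="index")

# Transposing moves the keys back to the column axis (one row).
transposed = tall.T

# Wrapping the dict in a list also yields one row with the keys as columns.
wide = pd.DataFrame([record])

print(tall.columns)
print(wide.columns)
```

Either `transposed` or `wide` has column names that can match the table; `tall` only has the column 0 that MySQL complains about.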

NOTE: I think you would get much better performance if you built up a dict (or some other structure), converted all rows to a DataFrame, and then pushed them to MySQL. Going one row at a time might be too slow.
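
One way the batching idea could look, as a rough sketch rather than a tested solution for this dataset: collect records into fixed-size chunks and build one DataFrame per chunk. The helper names `batched` and `frames` and the default chunk size of 1000 are assumptions made for illustration; `parseJSON` and `engine` in the commented usage refer to the names from the question.

```python
from itertools import islice

import pandas as pd


def batched(records, size):
    """Yield lists of up to `size` items from any iterable (hypothetical helper)."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def frames(records, size=1000):
    """Yield one DataFrame per chunk; each dict becomes one row."""
    for chunk in batched(records, size):
        yield pd.DataFrame(chunk)


# Possible usage with the names from the question (not run here):
# for df in frames(parseJSON("MyFile.json")):
#     df.to_sql(name='MyTable', con=engine, if_exists='append', index=False)
```

This keeps memory bounded by the chunk size while issuing far fewer to_sql calls than one per line. List-valued fields such as helpful may still need to be converted (for example to a string) before the driver will accept them.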
