1

I am trying to insert pandas dataframe into a postgres table as fast as possible. I have tried out executemany but it's too slow. It was taking 6 hours for inserting 24000 rows. Most online solutions suggested that psycopg2's copy_from is a faster method. I am using the following code to insert dataframe into postgres table.

f = io.StringIO()
df.to_csv(f, index=False, header=False)
f.seek(0)

con = psycopg2.connect(**config_dev)
cur = con.cursor()
cur.copy_from(f, 'table', null="", sep=",")
con.commit()
cur.close()

But I am getting this error i.e. psycopg2.DataError: invalid input syntax for integer: "10% off on shoulder bags" CONTEXT: COPY table, line 1, column coupon_id: "10% off on shoulder bags". Here the text from the next column is trying to get inserted into coupon_id which is an integer. This is because my first column in dataframe is being inserted to primary_key id in the postgres table which messes up the columns and hence the data error. Does anyone has a solution for it?

3
  • stackoverflow.com/questions/35275754/… Commented Aug 22, 2017 at 10:35
  • Thank you. Can you explain the SQL copy query that was used? I mean the delimiter is , and the file type would be csv? Commented Aug 22, 2017 at 10:44
  • 1
    The solution didn't work. It gives me error even after settings permission psycopg2.ProgrammingError: must be superuser to COPY to or from a file HINT: Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone Commented Aug 22, 2017 at 11:46

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.