1

I am using pandas' to_sql method to insert data into a MySQL table. The table already exists, and I'd like to avoid inserting duplicate rows.

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html

Is there a way to do this in Python?

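# MySQL connection
import pandas as pd
import pymysql  # driver used by the "mysql+pymysql" dialect
from sqlalchemy import create_engine

user = 'user1'
pwd = 'xxxx'
host = 'aa1.us-west-1.rds.amazonaws.com'
port = 3306
database = 'main'

# include the port, which was defined above but never used
engine = create_engine("mysql+pymysql://{}:{}@{}:{}/{}".format(user, pwd, host, port, database))

# df is an existing DataFrame; append its rows to the existing table "dfx"
con = engine.connect()
df.to_sql(name="dfx", con=con, if_exists='append')
con.close()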

If there isn't a straightforward way to do this, are there any workarounds?

2
  • If your table isn't exceptionally large, you could simply read the whole table into a DataFrame, concat it with the new data, drop the duplicates, and then run the existing code from your post. Commented Jul 13, 2022 at 2:55
  • I'd say skip the Pandas layer and go for raw SQL so you can pass the duplicate handling to SQL. Commented Jul 13, 2022 at 3:03
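
A rough sketch of the first comment's idea (compare against what is already in the table, then append only the rows that are missing). To keep it runnable without a server, it uses an in-memory SQLite database as a stand-in for MySQL. The helper name append_new_rows, the key column "id", and the sample data are all made up for illustration.

```python
import sqlite3
import pandas as pd

def append_new_rows(df, table, con, key_cols):
    """Append only the rows of df whose key is not already in the table.

    Hypothetical helper: assumes key_cols identify a row uniquely.
    """
    cols = ", ".join(key_cols)
    existing = pd.read_sql(f"SELECT {cols} FROM {table}", con).drop_duplicates()
    # Left anti-join: keep rows of df that have no match among existing keys.
    merged = df.merge(existing, on=key_cols, how="left", indicator=True)
    new_rows = merged.loc[merged["_merge"] == "left_only", df.columns]
    # Also drop duplicates inside the incoming batch itself.
    new_rows = new_rows.drop_duplicates(subset=key_cols)
    if new_rows.empty:
        return 0
    new_rows.to_sql(table, con, if_exists="append", index=False)
    return len(new_rows)

# Stand-in database (assumption: SQLite instead of the MySQL engine above).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dfx (id INTEGER, name TEXT)")
con.execute("INSERT INTO dfx VALUES (1, 'a'), (2, 'b')")

df = pd.DataFrame({"id": [2, 3, 3], "name": ["b", "c", "c"]})
inserted = append_new_rows(df, "dfx", con, ["id"])
```

This only reads the key columns rather than the whole table, but it is still not safe against another process inserting the same keys between the read and the write.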

2 Answers

1

It sounds like you want to do an "upsert" (insert or update). Pangres is a useful package that lets you upsert from a pandas DataFrame. If you don't want to update a row that already exists, you can set if_row_exists to 'ignore'.
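
To show what "ignore the row if it already exists" means at the database level, here is a minimal sketch in plain SQL driven from Python. It is not how Pangres is implemented, just the same behaviour done by hand. It assumes the table has a primary (or unique) key, and it uses an in-memory SQLite database so it runs anywhere; the table layout and data are invented.

```python
import sqlite3
import pandas as pd

con = sqlite3.connect(":memory:")
# The primary key is what lets the database recognise a duplicate.
con.execute("CREATE TABLE dfx (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("INSERT INTO dfx VALUES (1, 'a'), (2, 'b')")

# id 2 already exists, id 3 is new.
df = pd.DataFrame({"id": [2, 3], "name": ["changed", "c"]})

# SQLite spelling; in MySQL the statement would start with "INSERT IGNORE INTO".
sql = "INSERT OR IGNORE INTO dfx (id, name) VALUES (?, ?)"
# Convert to plain Python types so the driver can bind them.
rows = [(int(i), str(n)) for i, n in zip(df["id"], df["name"])]
con.executemany(sql, rows)
con.commit()

result = con.execute("SELECT id, name FROM dfx ORDER BY id").fetchall()
```

The existing row with id 2 keeps its old value and only id 3 is added. If you want existing rows updated instead, MySQL has INSERT ... ON DUPLICATE KEY UPDATE for that.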

0

I had never heard of "upsert" before today, but it sounds interesting. Alternatively, you could delete the duplicates after the data is loaded into your table.

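-- SQL Server (T-SQL) syntax: number the rows within each group of duplicates
WITH a AS
(
    SELECT Firstname,
           ROW_NUMBER() OVER (PARTITION BY Firstname, empID ORDER BY Firstname) AS duplicateRecCount
    FROM dbo.tblEmployee
)
-- Now delete the duplicate records, keeping the first row of each group
DELETE FROM a
WHERE duplicateRecCount > 1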

That will work fine, unless you have billions of rows.
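
For a version you can try from Python, here is a small sketch of the same "load first, delete duplicates afterwards" idea. It is an assumption-laden stand-in: it runs against an in-memory SQLite database and relies on SQLite's built-in rowid, so the DELETE would need adapting for MySQL (for example by using an auto-increment key in place of rowid). The sample rows are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tblEmployee (empID INTEGER, Firstname TEXT)")
con.executemany(
    "INSERT INTO tblEmployee VALUES (?, ?)",
    [(1, "Ann"), (1, "Ann"), (2, "Bob"), (1, "Ann")],
)

# Keep the lowest rowid in each (Firstname, empID) group and delete the rest.
con.execute(
    """
    DELETE FROM tblEmployee
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM tblEmployee GROUP BY Firstname, empID
    )
    """
)
con.commit()

remaining = con.execute(
    "SELECT empID, Firstname FROM tblEmployee ORDER BY empID"
).fetchall()
```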
