Bulk Upsert with SQLAlchemy Postgres

Question

I'm following the SQLAlchemy documentation here to write a bulk upsert statement with Postgres. For demonstration purposes, I have a simple table MyTable:

class MyTable(base):
    __tablename__ = 'mytable'
    id = Column(types.Integer, primary_key=True)
    test_value = Column(types.Text)

Creating a generic insert statement is simple enough:

from sqlalchemy.dialects import postgresql

values = [{'id': 0, 'test_value': 'a'}, {'id': 1, 'test_value': 'b'}]
insert_stmt = postgresql.insert(MyTable.__table__).values(values)

The problem I run into is when I try to add the "on conflict" part of the upsert.

update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_=dict(data=values)
)

Trying to execute this statement yields a ProgrammingError:

from sqlalchemy import create_engine
engine = create_engine('postgres://localhost/db_name')

engine.execute(update_stmt)

>>> ProgrammingError: (psycopg2.ProgrammingError) can't adapt type 'dict'

I think my misunderstanding is in constructing the statement with the on_conflict_do_update method. Does anyone know how to construct this statement? I have looked at other questions on StackOverflow (eg. here) but I can't seem to a way to address the above error.

Paul Becotte · Accepted Answer · 2019-03-27 03:51:19Z

32

update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_=dict(data=values)
)

index_elements should either be a list of strings or a list of column objects. So either [MyTable.id] or ['id'] (This is correct)

set_ should be a dictionary with column names as keys and valid sql update objects as values. You can reference values from the insert block using the excluded attribute. So to get the result you are hoping for here you would want set_={'test_value': insert_stmt.excluded.test_value} (The error you made is that data= in the example isn't a magic argument... it was the name of the column on their example table)

So, the whole thing would be

update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_={'test_value': insert_stmt.excluded.test_value}
)

Of course, in a real world example I usually want to change more then one column. In that case I would do something like...

update_columns = {col.name: col for col in insert_stmt.excluded if col.name not in ('id', 'datetime_created')}
update_statement = insert_stmt.on_conflict_do_update(index_elements=['id'], set_=update_columns)

(This example would overwrite every column except for the id and datetime_created columns)

answered Mar 27, 2019 at 3:51

Paul Becotte

9,9973 gold badges36 silver badges43 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

user144153 Over a year ago

Great answer - super clear and easy to follow. Thanks!

deed02392 Over a year ago

This optimised my horrendous alpha code from 90 seconds to 9 seconds. Someone once told me "never execute SQL statements in a loop" and it stuck with me. I knew I had to find a better way, so - cheers!

Mike Over a year ago

I keep getting AttributeError: 'Insert' object has no attribute 'excluded' when i'm trying to compose the update_columns. My insert statement looks like this: insert_statement = insert(table).values(items) where table is of type DeclarativeMeta and items is a list of dicts. Any advice would be appreciated.

Mike Over a year ago

i figured it out... for anyone with this problem, pay big attention to where you import the insert from. it should be from sqlalchemy.dialects.postgresql import insert not from sqlalchemy import insert. god.

Paul Becotte Over a year ago

Yes, for database specific constructs like this, you have to import from the dialect directly.

|

Collectives™ on Stack Overflow

Bulk Upsert with SQLAlchemy Postgres

1 Answer 1

6 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

6 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related