12

I'm following the SQLAlchemy documentation here to write a bulk upsert statement with Postgres. For demonstration purposes, I have a simple table MyTable:

class MyTable(base):
    __tablename__ = 'mytable'
    id = Column(types.Integer, primary_key=True)
    test_value = Column(types.Text)

Creating a generic insert statement is simple enough:

from sqlalchemy.dialects import postgresql

values = [{'id': 0, 'test_value': 'a'}, {'id': 1, 'test_value': 'b'}]
insert_stmt = postgresql.insert(MyTable.__table__).values(values)

The problem I run into is when I try to add the "on conflict" part of the upsert.

update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_=dict(data=values)
)

Trying to execute this statement yields a ProgrammingError:

from sqlalchemy import create_engine
engine = create_engine('postgres://localhost/db_name')

engine.execute(update_stmt)

>>> ProgrammingError: (psycopg2.ProgrammingError) can't adapt type 'dict'

I think my misunderstanding is in constructing the statement with the on_conflict_do_update method. Does anyone know how to construct this statement? I have looked at other questions on StackOverflow (eg. here) but I can't seem to a way to address the above error.

1 Answer 1

32
update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_=dict(data=values)
)

index_elements should either be a list of strings or a list of column objects. So either [MyTable.id] or ['id'] (This is correct)

set_ should be a dictionary with column names as keys and valid sql update objects as values. You can reference values from the insert block using the excluded attribute. So to get the result you are hoping for here you would want set_={'test_value': insert_stmt.excluded.test_value} (The error you made is that data= in the example isn't a magic argument... it was the name of the column on their example table)

So, the whole thing would be

update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_={'test_value': insert_stmt.excluded.test_value}
)

Of course, in a real world example I usually want to change more then one column. In that case I would do something like...

update_columns = {col.name: col for col in insert_stmt.excluded if col.name not in ('id', 'datetime_created')}
update_statement = insert_stmt.on_conflict_do_update(index_elements=['id'], set_=update_columns)

(This example would overwrite every column except for the id and datetime_created columns)

Sign up to request clarification or add additional context in comments.

6 Comments

Great answer - super clear and easy to follow. Thanks!
This optimised my horrendous alpha code from 90 seconds to 9 seconds. Someone once told me "never execute SQL statements in a loop" and it stuck with me. I knew I had to find a better way, so - cheers!
I keep getting AttributeError: 'Insert' object has no attribute 'excluded' when i'm trying to compose the update_columns. My insert statement looks like this: insert_statement = insert(table).values(items) where table is of type DeclarativeMeta and items is a list of dicts. Any advice would be appreciated.
i figured it out... for anyone with this problem, pay big attention to where you import the insert from. it should be from sqlalchemy.dialects.postgresql import insert not from sqlalchemy import insert. god.
Yes, for database specific constructs like this, you have to import from the dialect directly.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.