During an ETL process I needed to extract and load a JSON column from one Postgres database to another. We use Pandas for this since it can read from and write to so many different sources and destinations, and all the transformations can be written in Python with Pandas. We're quite happy with the approach, to be honest, but we hit a problem.

Usually it's quite easy to read and write the data. You just use pandas.read_sql_table to read from the source and DataFrame.to_sql to write to the destination. But since one of the source tables had a column of type JSON (in Postgres), the to_sql call crashed with the following error:

    df.to_sql(table_name, analytics_db)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/core/generic.py", line 1201, in to_sql
    chunksize=chunksize, dtype=dtype)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 470, in to_sql
    chunksize=chunksize, dtype=dtype)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 1147, in to_sql
    table.insert(chunksize)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 663, in insert
    self._execute_insert(conn, keys, chunk_iter)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 638, in _execute_insert
    conn.execute(self.insert_statement(), data)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 945, in execute
    return meth(self, multiparams, params)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
    context)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1393, in _handle_dbapi_exception
    exc_info
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1159, in _execute_context
    context)
  File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 459, in do_executemany
    cursor.executemany(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) can't adapt type 'dict'

3 Answers

I searched the web for a solution but couldn't find one, so here is what we came up with (there may be better ways, but at least it's a start if someone else runs into this).

Specify the dtype parameter in to_sql.

We went from:

df.to_sql(table_name, analytics_db)

to:

df.to_sql(table_name, analytics_db,
          dtype={'name_of_json_column_in_source_table': sqlalchemy.types.JSON})

and it just works.
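
For context, here is a minimal end-to-end sketch of that fix; the connection strings and table names below are illustrative, not from our actual setup:

import pandas as pd
import sqlalchemy

# Illustrative engines; replace with your own connection strings.
source_db = sqlalchemy.create_engine('postgresql://user:pass@source-host/source')
analytics_db = sqlalchemy.create_engine('postgresql://user:pass@analytics-host/analytics')

# Read the source table; the Postgres JSON column arrives as a column of dicts.
df = pd.read_sql_table('source_table', source_db)

# Map the dict column to sqlalchemy.types.JSON so the driver can adapt it.
df.to_sql('target_table', analytics_db,
          dtype={'name_of_json_column_in_source_table': sqlalchemy.types.JSON})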

3 Comments

dtype={'name_of_json_column_in_source_table': sqlalchemy.types.JSON} also works if the source column is of type jsonb.
Setting dtype={'name_of_json_column_in_source_table': sqlalchemy.types.JSON} works even when writing a column of dicts in a pandas DataFrame to a JSON column in MySQL 8.0.19. Thanks!
I'm getting an error with my json column "scans": ValueError: scans (<class 'sqlalchemy.sql.sqltypes.JSON'>) not a string

If you (re-)create the JSON column using json.dumps(), you're all set. The data can then be written with pandas' .to_sql() method, and the much faster PostgreSQL COPY method (via copy_expert() of psycopg2 or SQLAlchemy's raw_connection()) can also be employed.

For the sake of simplicity, let's assume that we have a column of dictionaries that should be written into a JSON(B) column:

import json
import pandas as pd

df = pd.DataFrame([['row1', {'a': 1, 'b': 2}],
                   ['row2', {'a': 3, 'b': 4, 'c': 'some text'}]],
                  columns=['r', 'kv'])

# conversion function:
def dict2json(dictionary):
    return json.dumps(dictionary, ensure_ascii=False)

# overwrite the dict column with json-strings
df['kv'] = df.kv.map(dict2json)
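
The COPY route mentioned above, as a minimal sketch; it assumes a psycopg2-backed SQLAlchemy engine named engine and an existing table target_table(r text, kv jsonb), both of which are illustrative names:

import io

# Dump the frame to an in-memory CSV; the kv column already holds JSON strings.
buf = io.StringIO()
df.to_csv(buf, index=False, header=False)
buf.seek(0)

# Use psycopg2's copy_expert() through SQLAlchemy's raw_connection().
raw_conn = engine.raw_connection()
try:
    with raw_conn.cursor() as cur:
        cur.copy_expert("COPY target_table (r, kv) FROM STDIN WITH (FORMAT CSV)", buf)
    raw_conn.commit()
finally:
    raw_conn.close()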

I am unable to comment on peralmq's answer, but in the case of PostgreSQL JSONB you can use:

from sqlalchemy import dialects
dataframe.to_sql(..., dtype={"json_column": dialects.postgresql.JSONB})
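
A fuller version of that call, as a hedged sketch (the engine URL and table/column names are illustrative):

from sqlalchemy import create_engine
from sqlalchemy.dialects import postgresql

engine = create_engine('postgresql://user:pass@host/db')
dataframe.to_sql('target_table', engine,
                 dtype={'json_column': postgresql.JSONB})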

1 Comment

I'm getting an error with my json column "scans": ValueError: scans (<class 'sqlalchemy.dialects.postgresql.json.JSONB'>) not a string
