I have a Postgres table with a JSONB column; each row contains a large JSONB object (~4500 keys, around 110 KB when serialized as text). I want to query these rows and get the entire JSONB object.

The query itself is fast -- when I run EXPLAIN ANALYZE, or omit the JSONB column, it returns in 100-300 ms. But when I execute the full query, it takes on the order of minutes. The exact same query on a previous version of the data (where each JSONB object was about half as large) was also fast.

Some notes:

  • This ends up in Python (via SQLAlchemy/psycopg2). I'm worried that the query executor converts JSONB to JSON, which is then encoded as text for transfer over the wire, and finally JSON-decoded again on the Python end. Is this correct, and if so, how can I mitigate it? When I select the JSONB column as ::text, the query is roughly twice as fast.

  • I only need a small subset of the JSON (around 300 keys, or 6% of them). I tried filtering the JSON in the query itself, but that caused a substantial further performance hit -- it ended up being faster to return the entire object.
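A minimal sketch of the ::text workaround mentioned in the first note, using psycopg2 directly. The table and column names (`my_table`, `data`) are placeholders, not from the original post:

```python
import json

# Hypothetical query: casting the JSONB column to text makes psycopg2
# return a plain string instead of automatically parsing it into a
# Python object, so the client controls when (and how) parsing happens.
QUERY = "SELECT id, data::text FROM my_table WHERE id = ANY(%s)"

def parse_rows(rows):
    """Parse (id, json_text) tuples fetched from Postgres on the client side."""
    return {row_id: json.loads(raw) for row_id, raw in rows}

# Usage with psycopg2 (connection string is a placeholder):
# import psycopg2
# with psycopg2.connect("dbname=mydb") as conn, conn.cursor() as cur:
#     cur.execute(QUERY, ([1, 2, 3],))
#     objects = parse_rows(cur.fetchall())
```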

  • First, try creating and running the query in pure SQL, and run it on the same node as the Postgres server (if possible); this will eliminate any SQLAlchemy/network issues. Next, provide more details about the data, the actual SQL you ran, and the query timings. Commented Sep 30, 2017 at 7:14
  • @JonScott when you say rewrite the query in pure SQL, does this mean I should essentially remove any PostgreSQL specific features (like JSON)? Commented Oct 2, 2017 at 15:05

1 Answer

This is not necessarily a solution, but here is an update:

By casting the JSONB column to text in the Postgres query, I was able to substantially cut down both query execution time and result fetching on the Python end.

On the Python end, calling json.loads for every row in the result set brings the total time right back to that of the regular query. However, with the ujson library I was able to obtain a significant speedup: casting to text in the query and then calling ujson.loads on the Python end is roughly 3x faster than simply returning JSON from the query.
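The decoding step described above can be sketched as follows. ujson is a third-party package (`pip install ujson`); the fallback to the stdlib json module is an addition here for portability and is not part of the original answer:

```python
try:
    import ujson as fast_json  # third-party C-accelerated JSON decoder
except ImportError:
    import json as fast_json   # stdlib fallback; same loads() interface, slower

def decode_rows(rows):
    """Decode (id, json_text) rows fetched with `SELECT id, data::text ...`."""
    return [(row_id, fast_json.loads(raw)) for row_id, raw in rows]
```

The key point is that the query returns raw strings, so all JSON parsing happens exactly once, in whichever decoder is fastest on the client.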
