PostgreSQL 14.12 on Linux.
I am encountering a weird situation where the same query gets 250x faster when minimally arranging my query into a subquery.
This is even though the query plans are pretty much identical; concretely, the involved Nested Loop (nodes and everything below them) are completely identical, yet their execution times are vastly different:
time=32.181..19092.418 vs time=0.035..2.267
even though emitted-rows counts and loop counts are equal.
I have thoroughly read https://www.postgresql.org/docs/17/using-explain.html but can't make sense of the difference.
Data model
Each user (table users) has some houses. Each house contains thousands of screws, tracked in jsonb_description containing { "screws": [ { "screw_location": null }, { "screw_location": "under the bathtub" } ] }.
I want to count for some users' houses how many screws are in them with known (non-null) locations.
Slow query
EXPLAIN ANALYZE
SELECT
(SELECT name FROM users WHERE id = houses.owner) as owner_name,
jsonb_array_length(jsonb_path_query_array(jsonb_description, '$.screws.keyvalue() ? (@.value.screw_location <> null)')) AS used_screw_count
FROM houses
WHERE owner IN (SELECT id FROM users WHERE name IN ( 'a', 'b', 'd', 'e', 'f' ) )
ORDER BY construction_date
LIMIT 1;
Limit (cost=75.91..78.43 rows=1 width=52) (actual time=19095.625..19095.626 rows=1 loops=1)
-> Result (cost=75.91..131.40 rows=22 width=52) (actual time=19095.624..19095.625 rows=1 loops=1)
-> Sort (cost=75.91..75.96 rows=22 width=36) (actual time=19095.591..19095.592 rows=1 loops=1)
Sort Key: houses."construction_date"
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.57..75.80 rows=22 width=36) (actual time=32.181..19092.418 rows=1674 loops=1)
-> Index Scan using users_name_key on users (cost=0.28..12.48 rows=5 width=16) (actual time=0.013..0.083 rows=5 loops=1)
Index Cond: (name = ANY ('{a,b,d,e,f}'::text[]))
-> Index Scan using houses_owner_ix on houses (cost=0.29..12.51 rows=10 width=391) (actual time=0.007..1.757 rows=335 loops=5)
Index Cond: (owner = users.id)
SubPlan 1
-> Index Scan using users_pkey on users users_1 (cost=0.28..2.50 rows=1 width=23) (actual time=0.026..0.026 rows=1 loops=1)
Index Cond: (id = houses.owner)
Planning Time: 0.224 ms
Execution Time: 19095.653 ms
Fast query
EXPLAIN ANALYZE
SELECT
(SELECT name FROM users WHERE id = houses.owner) as owner_name,
jsonb_array_length(jsonb_path_query_array(jsonb_description, '$.screws.keyvalue() ? (@.value.screw_location <> null)')) AS used_screw_count
FROM
(
SELECT * FROM houses
WHERE owner IN (SELECT id FROM users WHERE name IN ( 'a', 'b', 'd', 'e', 'f' ) )
ORDER BY construction_date
) AS houses
LIMIT 1;
Limit (cost=76.01..78.53 rows=1 width=44) (actual time=72.206..72.210 rows=1 loops=1)
-> Subquery Scan on houses (cost=76.01..131.51 rows=22 width=44) (actual time=72.205..72.207 rows=1 loops=1)
-> Sort (cost=76.01..76.07 rows=22 width=1011) (actual time=3.363..3.365 rows=1 loops=1)
Sort Key: houses_1."construction_date"
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.57..75.52 rows=22 width=1011) (actual time=0.035..2.267 rows=1674 loops=1)
-> Index Scan using users_name_key on users (cost=0.28..12.48 rows=5 width=16) (actual time=0.021..0.045 rows=5 loops=1)
Index Cond: (name = ANY ('{a,b,d,e,f}'::text[]))
-> Index Scan using houses_owner_ix on houses houses_1 (cost=0.29..12.51 rows=10 width=391) (actual time=0.005..0.217 rows=335 loops=5)
Index Cond: (owner = users.id)
SubPlan 1
-> Index Scan using users_pkey on users users_1 (cost=0.28..2.50 rows=1 width=23) (actual time=0.004..0.005 rows=1 loops=1)
Index Cond: (id = houses.owner)
Planning Time: 0.464 ms
Execution Time: 76.245 ms
Why is this query 250x faster?
I already suspect a different application of LIMIT 1 (e.g. that the slow query forces the computation of the full result set before applying `LIMIT 1), but:
- Should this not show up as a query plan difference, instead of them being identical?
- Why does the subquery have this effect?
- Why are the
Nested Looptimes drastically different? - What tools can I use to analyse this difference?
Thanks!
EXPLAIN (ANALYZE, BUFFERS), preferably after making sure track_io_timing is on.SELECT u.name as owner_name, jsonb_array_length(jsonb_path_query_array(jsonb_description, '$.screws.keyvalue() ? (@.value.screw_location <> null)')) AS used_screw_count FROM houses h JOIN users u ON u.id = h.owner WHERE u.name IN ( 'a', 'b', 'd', 'e', 'f' ) ORDER BY construction_date LIMIT 1;Also it seems your indexes aren't covering, please show the full index definitions for both tables.