
I have a query (executed against a Postgres (9.6.10) database using psql) that returns 10 rows. The query executes 20x slower when selecting 30 columns rather than 1.

I think I know why this is happening (see EXPLAIN Output below). And I'm guessing the workaround is to select only the id and then rejoin on the data. Does this suggest a bug in the query planner? Any other workaround?
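For reference, here is the rejoin workaround I have in mind (an untested sketch; the table and column names are exactly those used in the queries below):

-- Compute the 10 ids with a narrow subquery, then join back to fetch
-- the wide rows, so the expensive sort only ever handles 8-byte ids
-- instead of ~1.6 kB rows.
SELECT fundraisers.*
FROM fundraisers
INNER JOIN (
  SELECT fundraisers.id
  FROM fundraisers
  INNER JOIN audit_logs ON audit_logs.fundraiser_id = fundraisers.id
  LEFT OUTER JOIN accounts ON accounts.id = fundraisers.account_id
  GROUP BY accounts.id, fundraisers.id
  LIMIT 10
) ids ON ids.id = fundraisers.id;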

Query 1 (executes in 20 seconds)

EXPLAIN ANALYZE SELECT fundraisers.*
FROM fundraisers 
INNER JOIN audit_logs ON audit_logs.fundraiser_id = fundraisers.id 
LEFT OUTER JOIN accounts ON accounts.id = fundraisers.account_id 
GROUP BY accounts.id, fundraisers.id  
LIMIT 10

Query 2 (executes in 1 second)

Differs only in the selected columns

EXPLAIN ANALYZE SELECT fundraisers.id
FROM fundraisers 
INNER JOIN audit_logs ON audit_logs.fundraiser_id = fundraisers.id 
LEFT OUTER JOIN accounts ON accounts.id = fundraisers.account_id 
GROUP BY accounts.id, fundraisers.id  
LIMIT 10

EXPLAIN Output

One thing I notice in the EXPLAIN output is that the Hash Join has a different cost due to the width of the data being joined, i.e.:

->  Hash Join  (cost=25967.06..109216.83 rows=1359646 width=1634) (actual time=322.987..1971.464 rows=1356192 loops=1)

vs

->  Hash Join  (cost=14500.06..74422.83 rows=1359646 width=8) (actual time=111.710..730.736 rows=1356192 loops=1)
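
Some back-of-the-envelope arithmetic (mine, to make those widths concrete): the wide join feeds roughly 1,356,192 rows × 1,634 bytes ≈ 2.2 GB into the sort, which lines up with the Sort Method: external merge  Disk: 1976192kB reported in the first plan below, while the narrow join sorts only 1,356,192 rows × 8 bytes ≈ 10 MB, in line with the 23840kB spill in the second plan.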

More Details

database=# EXPLAIN ANALYZE SELECT fundraisers.*
database-# FROM fundraisers 
database-# INNER JOIN audit_logs ON audit_logs.fundraiser_id = fundraisers.id 
database-# LEFT OUTER JOIN accounts ON accounts.id = fundraisers.account_id 
database-# GROUP BY accounts.id, fundraisers.id  
database-# LIMIT 10;
                                                                       QUERY PLAN                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3147608.91..3147608.98 rows=10 width=1634) (actual time=20437.137..20437.190 rows=10 loops=1)
   ->  Group  (cost=3147608.91..3157806.25 rows=1359646 width=1634) (actual time=20437.136..20437.186 rows=10 loops=1)
         Group Key: accounts.id, fundraisers.id
         ->  Sort  (cost=3147608.91..3151008.02 rows=1359646 width=1634) (actual time=20437.133..20437.165 rows=120 loops=1)
               Sort Key: accounts.id, fundraisers.id
               Sort Method: external merge  Disk: 1976192kB
               ->  Hash Join  (cost=25967.06..109216.83 rows=1359646 width=1634) (actual time=322.987..1971.464 rows=1356192 loops=1)
                     Hash Cond: (audit_logs.fundraiser_id = fundraisers.id)
                     ->  Seq Scan on audit_logs  (cost=0.00..40634.14 rows=1517914 width=4) (actual time=0.078..324.638 rows=1517915 loops=1)
                     ->  Hash  (cost=13794.41..13794.41 rows=56452 width=1634) (actual time=321.869..321.869 rows=56452 loops=1)
                           Buckets: 4096  Batches: 32  Memory Usage: 2786kB
                           ->  Hash Left Join  (cost=1548.76..13794.41 rows=56452 width=1634) (actual time=16.465..122.406 rows=56452 loops=1)
                                 Hash Cond: (fundraisers.account_id = accounts.id)
                                 ->  Seq Scan on fundraisers  (cost=0.00..11546.52 rows=56452 width=1630) (actual time=0.068..54.434 rows=56452 loops=1)
                                 ->  Hash  (cost=965.56..965.56 rows=46656 width=4) (actual time=16.337..16.337 rows=46656 loops=1)
                                       Buckets: 65536  Batches: 1  Memory Usage: 2153kB
                                       ->  Seq Scan on accounts  (cost=0.00..965.56 rows=46656 width=4) (actual time=0.020..8.268 rows=46656 loops=1)

 Planning time: 0.748 ms
 Execution time: 21013.427 ms
(19 rows)

database=# EXPLAIN ANALYZE SELECT fundraisers.id
database-# FROM fundraisers 
database-# INNER JOIN audit_logs ON audit_logs.fundraiser_id = fundraisers.id 
database-# LEFT OUTER JOIN accounts ON accounts.id = fundraisers.account_id 
database-# GROUP BY accounts.id, fundraisers.id  
database-# LIMIT 10;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=231527.41..231527.48 rows=10 width=8) (actual time=1314.884..1314.917 rows=10 loops=1)
   ->  Group  (cost=231527.41..241724.75 rows=1359646 width=8) (actual time=1314.884..1314.914 rows=10 loops=1)
         Group Key: accounts.id, fundraisers.id
         ->  Sort  (cost=231527.41..234926.52 rows=1359646 width=8) (actual time=1314.883..1314.901 rows=120 loops=1)
               Sort Key: accounts.id, fundraisers.id
               Sort Method: external merge  Disk: 23840kB
               ->  Hash Join  (cost=14500.06..74422.83 rows=1359646 width=8) (actual time=111.710..730.736 rows=1356192 loops=1)
                     Hash Cond: (audit_logs.fundraiser_id = fundraisers.id)
                     ->  Seq Scan on audit_logs  (cost=0.00..40634.14 rows=1517914 width=4) (actual time=0.062..224.307 rows=1517915 loops=1)
                     ->  Hash  (cost=13794.41..13794.41 rows=56452 width=8) (actual time=111.566..111.566 rows=56452 loops=1)
                           Buckets: 65536  Batches: 1  Memory Usage: 2687kB
                           ->  Hash Left Join  (cost=1548.76..13794.41 rows=56452 width=8) (actual time=17.362..98.257 rows=56452 loops=1)
                                 Hash Cond: (fundraisers.account_id = accounts.id)
                                 ->  Seq Scan on fundraisers  (cost=0.00..11546.52 rows=56452 width=8) (actual time=0.067..54.676 rows=56452 loops=1)
                                 ->  Hash  (cost=965.56..965.56 rows=46656 width=4) (actual time=16.524..16.524 rows=46656 loops=1)
                                       Buckets: 65536  Batches: 1  Memory Usage: 2153kB
                                       ->  Seq Scan on accounts  (cost=0.00..965.56 rows=46656 width=4) (actual time=0.032..7.804 rows=46656 loops=1)
 Planning time: 0.469 ms
 Execution time: 1323.349 ms
(19 rows)
  • What's your reason for putting a GROUP BY? Commented Jan 26, 2019 at 1:22
  • In the one-column case, the accounts table isn't relevant to the returned set. Does the full EXPLAIN show any use of it? The large number of sequential scans makes me wonder if you have indexed the foreign keys in accounts and fundraisers. Commented Jan 26, 2019 at 1:24
  • @lau - The reason for the GROUP BY is to get distinct fundraisers. (Not standard, I believe, but it works in Postgres.) Commented Jan 26, 2019 at 2:10
  • @AndrewLazarus Not my intention at all. Sorry if I wasted your time! Background: I was trying to improve something that was initially a real-world problem. I saw what I thought was odd behavior. I removed all the practical bits to end up with the simplest version of the problem that would still exhibit the same behavior, and posted. So it became a question not of "how do I improve the performance of my query" but rather "why do I see this odd behavior?" Sorry if it appears devoid of practical application. Commented Jan 27, 2019 at 9:01
  • I'm glad you asked this question. I had the same issue: a query took ~20s longer when I included long character fields in my select statement. When I used a subselect with one column ("rejoin", as you write), it was suddenly a lot faster. Commented Oct 10, 2020 at 4:25

2 Answers


What you are missing in your analysis is the cost of the sort.

The sequence of things that happens is:

  1. Select data from the tables + JOIN (reasonably expensive).
  2. Sort the data, in preparation for the GROUP BY.
  3. GROUP BY (cheap thanks to the sort) + LIMIT as requested.

Reading the plans above from the bottom up, those steps are the Hash Join, the Sort, and the Group/Limit nodes, respectively.

I could not get my hands on documentation for this sort, so I will assume it works like another DBMS I know a little: Oracle.

As explained here, the server will sometimes need to use the hard disk for this operation, and that is very slow.

Most probably this is what happens with your queries, the difference being that PostgreSQL has either one field to write (= 1 second total execution) or thirty (= 20 seconds total execution).
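
Indeed, in your plans the Sort Method: external merge  Disk: ... lines show the sort spilling to disk. As a quick experiment (a sketch, assuming you can spare the memory; the value is illustrative, sized against the ~1.9 GB spill above), you could raise work_mem for your session and re-run the query to see whether the sort stays in memory:

-- Raise the per-sort memory budget for this session only, then re-check the plan.
SET work_mem = '2GB';
EXPLAIN ANALYZE SELECT fundraisers.*
FROM fundraisers
INNER JOIN audit_logs ON audit_logs.fundraiser_id = fundraisers.id
LEFT OUTER JOIN accounts ON accounts.id = fundraisers.account_id
GROUP BY accounts.id, fundraisers.id
LIMIT 10;
RESET work_mem;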


With all that being said, keep in mind that you are only working with a test query, probably equivalent to SELECT * FROM fundraisers LIMIT 10 (based on the field names; I do not have the table definitions to be sure).

I am not too shocked that with such a gap between what you want (= production query) and what you type (= test query), the DB behaves a bit funny.


1 Comment

I'm interested in why the relative performance of query 1 and query 2 differs, given that they only differ in the columns selected. My guess is that with a different query plan the performance would be the same, since only 10 records are returned. That is, a different query plan could do essentially what I mention as the workaround: select just one column and then rejoin.

First:

  • tables without keys have no meaning (this is a consequence of the second normal form)
  • (the result of) a query on such a table has no meaning
  • without any structure (PK, FK, secondary indexes), the optimizer has only two options: nested loops (on seq scans) or hash joins (see the sketch after this list)
  • hash joins are always a good choice, given enough memory
  • the final ORDER BY or GROUP BY needs a sort step over the complete result set, even just to find the top 10 results (the result of a hash join has no implicit order)
  • if a hash table grows too large (larger than work_mem), it will spill to disk
  • more columns need more space, even in the hash tables, so they exceed work_mem sooner and spill to disk
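
For illustration only, a minimal sketch of the structure meant above; the key and index columns are guessed from the join conditions in the question, not from the real schema:

-- Hypothetical keys/indexes, names inferred from the query's join columns.
ALTER TABLE accounts    ADD PRIMARY KEY (id);
ALTER TABLE fundraisers ADD PRIMARY KEY (id);
ALTER TABLE audit_logs
  ADD CONSTRAINT audit_logs_fundraiser_id_fkey
  FOREIGN KEY (fundraiser_id) REFERENCES fundraisers (id);
CREATE INDEX ON audit_logs (fundraiser_id);   -- FK columns are not indexed automatically
CREATE INDEX ON fundraisers (account_id);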

Finally: it makes no sense to benchmark and compare nonsensical queries. The whole optimizer machinery assumes sane data models; without one, it just produces something that works.

1 Comment

See also my new comment on the question.
