
I have this simple query in PostgreSQL:

EXPLAIN ANALYZE
SELECT * FROM email_events
WHERE act_owner_id = 500
ORDER BY date DESC
LIMIT 500;

The first execution of the query takes a very long time, about 7 seconds.

"Limit  (cost=0.43..8792.83 rows=500 width=2311) (actual time=3.064..7282.497 rows=500 loops=1)"
"  ->  Index Scan Backward using email_events_idx_date on email_events  (cost=0.43..233667.36 rows=13288 width=2311) (actual time=3.059..7282.094 rows=500 loops=1)"
"        Filter: (act_owner_id = 500)"
"        Rows Removed by Filter: 1053020"
"Total runtime: 7282.818 ms"

After the first execution the query is cached, I guess, and runs in 20-30 ms.

Why is the LIMIT query so slow when there is no cache? How can I fix this?

  • The table has 2.5 million rows. Commented Feb 21, 2014 at 19:48
  • Do you have an index on (act_owner_id, date)? Commented Feb 21, 2014 at 19:53
  • Yes I do; even if I order by act_owner_id the result is the same. Commented Feb 21, 2014 at 20:06
  • Try a composite index on act_owner_id + date (in this order); see the sketch after these comments. Commented Feb 21, 2014 at 20:12
  • Roughly how big is your database? What does your system load look like? (A really huge database with small resources might be lagging on I/O on the first attempt.) Commented Feb 21, 2014 at 21:04
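
A sketch of the composite index suggested in the comments (the index name is hypothetical):

-- Equality column first: all rows for one owner sit together in the
-- index, already ordered by date, so Pg can read just the first 500.
CREATE INDEX email_events_owner_date_idx
    ON email_events (act_owner_id, date);

With act_owner_id = 500 fixed, Pg can walk this index backwards to satisfy ORDER BY date DESC without a sort and without filtering out non-matching rows.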

2 Answers


CLUSTER table ON index seems to fix the problem. It seems that after bulk data loading the data is scattered all over the disk; CLUSTER reorders the table on disk to match the index.
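
A minimal sketch of that fix, assuming a composite index on (act_owner_id, date) exists; the index name is hypothetical:

-- Rewrite the table on disk in index order (takes an ACCESS EXCLUSIVE
-- lock for the duration of the rewrite).
CLUSTER email_events USING email_events_owner_date_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE email_events;

Note that CLUSTER is a one-time reordering: later inserts are not kept in index order, so it may need to be re-run after further bulk loads.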


1 Comment

Nice to see CLUSTERing on an index being useful for once.

PostgreSQL thinks it will be faster to scan the date-ordered index backwards (i.e. in DESC order), reading every row and throwing away the rows that don't have the right act_owner_id. That means 1053020 random reads, and backward index scans aren't very fast either.

Try creating an index on email_events(date DESC, act_owner_id). I think Pg will be able to do a forward index scan on that and then use the second index term to filter rows, so it shouldn't have to do a heap lookup. Test with EXPLAIN and see.
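
For example (the index name is illustrative):

-- Leading column matches the ORDER BY, so a forward scan returns rows
-- already in date DESC order; act_owner_id is checked in the index.
CREATE INDEX email_events_date_owner_idx
    ON email_events (date DESC, act_owner_id);

-- Re-check the plan afterwards:
EXPLAIN ANALYZE
SELECT * FROM email_events
WHERE act_owner_id = 500
ORDER BY date DESC
LIMIT 500;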

1 Comment

Not helping... The DB was filled with dummy data I generated, loaded table by table in bulk. I ran CLUSTER on one smaller table, and the problem seems to be fixed for that table. After clearing the cache, the first execution of the query takes less than 500 ms; before CLUSTER it used to take up to 20 seconds. I will run CLUSTER on the biggest table to see if that fixes the problem.
