
When I use the following query, the response time is really terrible (sometimes over a minute!).

select * from cdr where start_time < now() - interval '4 hours' and final = 0 limit 50 

I am trying to get records where final = 0 and start_time is more than 4 hours old. The following is the index I have on the table:

CREATE INDEX "cdr_Final_ix"
ON cdr
USING btree
(start_time , final );

The following is the explain analyze:

"Limit  (cost=0.00..167.81 rows=50 width=188) (actual time=64491.409..64650.635 rows=11 loops=1)"
"  ->  Seq Scan on cdr  (cost=0.00..749671.06 rows=223372 width=188) (actual time=64491.407..64650.625 rows=11 loops=1)"
"Filter: ((final = 0) AND (start_time < (now() - '04:00:00'::interval)))"
"Total runtime: 64650.690 ms"

Any help would be greatly appreciated. Thanks, Ari

  • How many rows? How many qualify? Note: you have no ORDER BY. Commented Jul 30, 2012 at 8:09
  • It doesn't matter to me what the order is. The number of rows in the table varies during the day between ~10K and 5 million. The number that qualifies varies between 0 and 200, because I usually update final after I query it. If I am not up to date, then more than 100K can qualify. Commented Jul 30, 2012 at 8:30
  • I get index-scans < 50 ms for 40K rows, even with "bad" tuning. Your tuning constants? Your distribution of values? Vacuum analyze ? version ? NB: I am now increasing the number of rows. Commented Jul 30, 2012 at 8:59
  • I'm using version 9.1. I was able to use the index successfully when I stated "start_time = '7/29/2012'", but this won't help me for my query. When it compares an inequality, it doesn't pick up the index. If my program that reads from Postgres is working correctly, then the rows where final = 0 should always be within 4 hours of the current date, or just a little outside the 4-hour window. Commented Jul 30, 2012 at 9:12
  • With a partial index (WHERE final = 0) I still get index scans, with results in sub-millisecond times; with a normal index, 100-200 ms, with about 2.5M rows. UPDATE: I'll post as an answer. Commented Jul 30, 2012 at 9:14
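The index behavior discussed in these comments follows from btree column order: in the original index (start_time, final), the leading column only has a range predicate, and nearly every row satisfies start_time < now() - 4 hours, so the planner prefers a sequential scan. Putting the equality column first lets the btree seek straight to final = 0 and then range-scan on start_time. A hedged alternative sketch (the index name here is illustrative, not from the original post):

```sql
-- Equality column first, range column second: the btree can seek
-- directly to final = 0, then scan start_time in order.
CREATE INDEX cdr_final_start_ix
ON cdr
USING btree
(final, start_time);
```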

1 Answer

-- DROP SCHEMA tmp CASCADE;
-- CREATE SCHEMA tmp ;
SET search_path='tmp';

-- Generate some data
CREATE TABLE cdr
        ( start_time TIMESTAMP NOT NULL
        , final INTEGER
        );
INSERT INTO cdr (start_time,final)
SELECT gs, random() * 1000
FROM generate_series('2012-07-01 00:00:00', '2012-08-01 00:00:00', '1 s'::interval) gs
        ;
DROP INDEX IF EXISTS "cdr_Final_ix"; -- IF EXISTS: the index may not exist on a fresh run
CREATE INDEX "cdr_Final_ix"
ON cdr
USING btree
(start_time , final )
WHERE final = 0 -- partial index here
;

-- Do some data-massaging
-- UPDATE cdr
-- SET final = random() * 100
-- WHERE final = 0
-- AND random() < 0.2 ;

VACUUM ANALYZE cdr;

-- SET tuning to default (the worst possible)
SET random_page_cost = 4;
SET work_mem = 64;
SET effective_cache_size = 64;
-- SET shared_buffers = 64;

EXPLAIN ANALYZE
SELECT * from cdr
WHERE start_time < now() - interval '4 hours'
AND final = 0
ORDER BY start_time
LIMIT 50
        ;

Result:

                                                           QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.01..88.49 rows=50 width=12) (actual time=0.191..0.452 rows=50 loops=1)
   ->  Index Scan using "cdr_Final_ix" on cdr  (cost=0.01..4310.95 rows=2436 width=12) (actual time=0.188..0.321 rows=50 loops=1)
         Index Cond: ((start_time < (now() - '04:00:00'::interval)) AND (final = 0))
 Total runtime: 0.569 ms
(4 rows)
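To apply the same idea to a live production table without blocking writes, something like the following should work (CREATE INDEX CONCURRENTLY cannot run inside a transaction block; the index name is illustrative). Note that when the partial predicate is WHERE final = 0, keeping final among the key columns is redundant:

```sql
-- Partial index covering only the "hot" rows; final is fixed by the
-- predicate, so start_time alone suffices as the key.
CREATE INDEX CONCURRENTLY cdr_final0_start_ix
ON cdr (start_time)
WHERE final = 0;
```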

5 Comments

OK, sounds good. I changed my index and I am vacuum analyzing now. What was the data massaging for, and is it necessary? I don't want to change my data.
No, the data massaging was just to alter the distribution of "final" in my synthetic dataset. I also shifted the dates up to 2012-07-30 + one month, with no change in the resulting plan.
Awesome! It worked! It's picking up the index. Thanks so much. I am getting a different query plan than you, though; see next comment.
Limit  (cost=204.54..391.71 rows=50 width=188) (actual time=0.469..0.533 rows=50 loops=1)
  ->  Bitmap Heap Scan on cdr  (cost=204.54..21298.88 rows=5635 width=188) (actual time=0.469..0.520 rows=50 loops=1)
        Recheck Cond: ((start_time < (now() - '04:00:00'::interval)) AND (final = 0))
        ->  Bitmap Index Scan on "cdr_Final_ix"  (cost=0.00..203.13 rows=5635 width=0) (actual time=0.394..0.394 rows=1194 loops=1)
              Index Cond: ((start_time < (now() - '04:00:00'::interval)) AND (final = 0))
Total runtime: 0.576 ms
Any idea why?
The reason is the tunables {random_page_cost, work_mem, effective_cache_size, shared_buffers}, plus maybe the distribution of the values (or the planner's estimate of it). YMMV...
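For reference, one way to inspect the planner settings mentioned in that comment on a given server (pg_settings is a standard system view):

```sql
-- Show the current value and unit of each tunable named above.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('random_page_cost', 'work_mem',
               'effective_cache_size', 'shared_buffers');
```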
