
Having a test table like this:

testdb=> \d test_table
                                             Table "public.test_table"
           Column           |            Type             |                         Modifiers
----------------------------+-----------------------------+-----------------------------------------------------------
 id                         | integer                     | not null default nextval('test_table_id_seq'::regclass)
 phone                      | character varying(15)       | not null
 comment                    | text                        |
 timestamp                  | timestamp without time zone | not null
Indexes:
    "test_table_pkey" PRIMARY KEY, btree (id)
    "i_test_substr" btree ("substring"(phone::text, 1, 1), "timestamp" DESC)
Triggers:

With an index like this:

testdb=> create index i_test_substr ON test_table (substring(phone, 1, 1), timestamp desc);

The query below uses the index to produce sorted output when the LIMIT is 203 or fewer rows, but switches to an in-memory sort when the LIMIT is 204 or more.

Can anyone explain this behaviour?

testdb=> explain analyze select * from test_table where substring(phone,1 ,1) = '2' order by timestamp desc limit 203;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..709.52 rows=203 width=202) (actual time=0.043..0.194 rows=203 loops=1)
   ->  Index Scan using i_test_substr on test_table  (cost=0.42..1146.16 rows=328 width=202) (actual time=0.041..0.170 rows=203 loops=1)
         Index Cond: ("substring"((phone)::text, 1, 1) = '2'::text)
 Total runtime: 0.249 ms
(4 rows)



testdb=> explain analyze select * from test_table where substring(phone,1 ,1) = '2' order by timestamp desc limit 204;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=711.74..712.25 rows=204 width=202) (actual time=7.655..7.681 rows=204 loops=1)
   ->  Sort  (cost=711.74..712.56 rows=328 width=202) (actual time=7.653..7.664 rows=204 loops=1)
         Sort Key: "timestamp"
         Sort Method: top-N heapsort  Memory: 53kB
         ->  Bitmap Heap Scan on test_table  (cost=10.96..698.03 rows=328 width=202) (actual time=1.340..5.010 rows=11514 loops=1)
               Recheck Cond: ("substring"((phone)::text, 1, 1) = '2'::text)
               ->  Bitmap Index Scan on i_test_substr  (cost=0.00..10.88 rows=328 width=0) (actual time=1.217..1.217 rows=11514 loops=1)
                     Index Cond: ("substring"((phone)::text, 1, 1) = '2'::text)
 Total runtime: 7.746 ms
(9 rows)
  • Please add the size of your table and the query's row count without the limit, plus explain analyze without the limit. Commented Feb 26, 2016 at 15:24
  • vacuum, analyze, reindex didn't help? Commented Feb 26, 2016 at 15:33
  • That's how the optimizer works. See Bruce Momjian's slides about this at google.fr/… Commented Feb 26, 2016 at 15:52

1 Answer


In short, it becomes too expensive to use an index scan when too many rows must be returned. An index scan fetches a disk page for each row, in the order the rows appear in the index, so some pages are fetched more than once (possibly many times). A bitmap index scan first builds a list of the disk pages containing matching rows and then reads each page only once.

So the optimizer estimates the cost of each possible plan and picks the cheapest one.
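You can see this arithmetic in the EXPLAIN output above. A rough sketch (not PostgreSQL's actual code): a LIMIT node scales the child plan's run cost by the fraction of estimated rows it expects to fetch, while the sort plan costs roughly the same regardless of the LIMIT, because the sort must consume all rows before returning any. The numbers below are read straight off the two plans; the `limited_cost` helper is an illustrative assumption, not a real PostgreSQL function.

```python
def limited_cost(startup, total, est_rows, limit):
    """Approximate cost of fetching `limit` of an estimated `est_rows` rows
    from a child plan: startup cost plus a pro-rated share of the run cost."""
    return startup + (limit / est_rows) * (total - startup)

# Index scan on i_test_substr: startup 0.42, total 1146.16, 328 estimated rows.
INDEX_STARTUP, INDEX_TOTAL, EST_ROWS = 0.42, 1146.16, 328

# Sort plan (bitmap scan + top-N heapsort): total cost 712.25 from the plan.
SORT_PLAN_COST = 712.25

for limit in (203, 204):
    index_cost = limited_cost(INDEX_STARTUP, INDEX_TOTAL, EST_ROWS, limit)
    winner = "index scan" if index_cost < SORT_PLAN_COST else "sort"
    print(f"LIMIT {limit}: index scan cost ~ {index_cost:.2f} -> planner picks {winner}")
```

With these estimates, LIMIT 203 prices the index scan at about 709.52 (matching the first plan), just under the sort plan's 712.25; at LIMIT 204 it rises to about 713, so the sort plan wins. The flip at exactly 204 is an artifact of the row estimate (328), not of the true row count (11514).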
