
Having a test table like this:

testdb=> \d test_table
                                             Table "public.test_table"
           Column           |            Type             |                         Modifiers
----------------------------+-----------------------------+-----------------------------------------------------------
 id                         | integer                     | not null default nextval('test_table_id_seq'::regclass)
 phone                      | character varying(15)       | not null
 comment                    | text                        |
 timestamp                  | timestamp without time zone | not null
Indexes:
    "test_table_pkey" PRIMARY KEY, btree (id)
    "i_test_substr" btree ("substring"(phone::text, 1, 1), "timestamp" DESC)
Triggers:

With an index like this:

testdb=> create index i_test_substr ON test_table (substring(phone, 1, 1), timestamp desc);

The query below uses the index to produce sorted output when the LIMIT is 203 or fewer rows, but switches to an in-memory sort when the LIMIT is 204 or more.

Can anyone explain this behaviour?

testdb=> explain analyze select * from test_table where substring(phone,1 ,1) = '2' order by timestamp desc limit 203;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..709.52 rows=203 width=202) (actual time=0.043..0.194 rows=203 loops=1)
   ->  Index Scan using i_test_substr on test_table  (cost=0.42..1146.16 rows=328 width=202) (actual time=0.041..0.170 rows=203 loops=1)
         Index Cond: ("substring"((phone)::text, 1, 1) = '2'::text)
 Total runtime: 0.249 ms
(4 rows)



testdb=> explain analyze select * from test_table where substring(phone,1 ,1) = '2' order by timestamp desc limit 204;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=711.74..712.25 rows=204 width=202) (actual time=7.655..7.681 rows=204 loops=1)
   ->  Sort  (cost=711.74..712.56 rows=328 width=202) (actual time=7.653..7.664 rows=204 loops=1)
         Sort Key: "timestamp"
         Sort Method: top-N heapsort  Memory: 53kB
         ->  Bitmap Heap Scan on test_table  (cost=10.96..698.03 rows=328 width=202) (actual time=1.340..5.010 rows=11514 loops=1)
               Recheck Cond: ("substring"((phone)::text, 1, 1) = '2'::text)
               ->  Bitmap Index Scan on i_test_substr  (cost=0.00..10.88 rows=328 width=0) (actual time=1.217..1.217 rows=11514 loops=1)
                     Index Cond: ("substring"((phone)::text, 1, 1) = '2'::text)
 Total runtime: 7.746 ms
(9 rows)
  • Please add the size of your table and the query's row count without the limit, plus explain analyze without the limit. Commented Feb 26, 2016 at 15:24
  • vacuum, analyze, reindex didn't help? Commented Feb 26, 2016 at 15:33
  • That's how the optimizer works. See Bruce Momjian's slides about this at google.fr/… Commented Feb 26, 2016 at 15:52

1 Answer


In short, it becomes too expensive to use an index scan when too many rows must be returned. An index scan fetches a disk page for each row, in the order the rows appear in the index, so some pages are fetched more than once (possibly many times). A bitmap index scan first builds a list of the disk pages containing matching rows and then reads each page only once.

So the optimizer estimates the cost of each possible plan and picks the cheapest one.
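You can see this arithmetic in the EXPLAIN output above. A rough sketch (not PostgreSQL's actual code): a LIMIT node scales the child plan's run cost by the fraction of estimated rows it expects to fetch, while the sort plan costs roughly the same regardless of the LIMIT, because the sort must consume all rows before returning any. The numbers below are read straight off the two plans; the `limited_cost` helper is an illustrative assumption, not a real PostgreSQL function.

```python
def limited_cost(startup, total, est_rows, limit):
    """Approximate cost of fetching `limit` of an estimated `est_rows` rows
    from a child plan: startup cost plus a pro-rated share of the run cost."""
    return startup + (limit / est_rows) * (total - startup)

# Index scan on i_test_substr: startup 0.42, total 1146.16, 328 estimated rows.
INDEX_STARTUP, INDEX_TOTAL, EST_ROWS = 0.42, 1146.16, 328

# Sort plan (bitmap scan + top-N heapsort): total cost 712.25 from the plan.
SORT_PLAN_COST = 712.25

for limit in (203, 204):
    index_cost = limited_cost(INDEX_STARTUP, INDEX_TOTAL, EST_ROWS, limit)
    winner = "index scan" if index_cost < SORT_PLAN_COST else "sort"
    print(f"LIMIT {limit}: index scan cost ~ {index_cost:.2f} -> planner picks {winner}")
```

With these estimates, LIMIT 203 prices the index scan at about 709.52 (matching the first plan), just under the sort plan's 712.25; at LIMIT 204 it rises to about 713, so the sort plan wins. The flip at exactly 204 is an artifact of the row estimate (328), not of the true row count (11514).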
