Postgres not using index with varchar_pattern_ops for pattern matching query

Question

I have a query in PostgreSQL that accesses a big table using a LIKE clause for pattern matching:

                                                    Table "rmx_service_schema.document"
        Column         |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description 
-----------------------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
 id                    | character varying(36)       |           | not null |         | extended |             |              | 
 file_name             | character varying(512)      |           | not null |         | extended |             |              | 
...

The query has very good selectivity:

select count(*) from RMX_SERVICE_SCHEMA.DOCUMENT d1_0;
 count  
--------
 630015


select count(*) from RMX_SERVICE_SCHEMA.DOCUMENT d1_0 where d1_0.FILE_NAME LIKE 'sunet_attachments/20240207.xml';
 count 
-------
     1

The application sometimes uses % at the end of the pattern, so replacing LIKE with = is not always possible.

I have created an index on that column with the matching operator class:

CREATE INDEX rse_tmp_doc_file_name ON RMX_SERVICE_SCHEMA.DOCUMENT (file_name varchar_pattern_ops);

But still, the pattern matching query does a Seq Scan:

EXPLAIN ANALYZE select id from RMX_SERVICE_SCHEMA.DOCUMENT d1_0 where d1_0.FILE_NAME LIKE 'sunet_attachments/20240207.xml';

                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..129562.16 rows=63 width=37) (actual time=81.075..90.793 rows=1 loops=1)
   Workers Planned: 4
   Workers Launched: 4
   ->  Parallel Seq Scan on document d1_0  (cost=0.00..128555.86 rows=16 width=37) (actual time=72.099..77.022 rows=0 loops=5)
         Filter: ((file_name)::text ~~ 'sunet_attachments/20240207.xml'::text)
         Rows Removed by Filter: 126007
 Planning Time: 0.285 ms
 Execution Time: 90.814 ms
(8 rows)

If I replace the LIKE by =, it uses the index:

EXPLAIN ANALYZE select id from RMX_SERVICE_SCHEMA.DOCUMENT d1_0 where d1_0.FILE_NAME ='sunet_attachments/20240207.xml';
                                                              QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using rse_tmp_doc_file_name on document d1_0  (cost=0.55..8.57 rows=1 width=37) (actual time=0.025..0.026 rows=1 loops=1)
   Index Cond: ((file_name)::text = 'sunet_attachments/20240207.xml'::text)
 Planning Time: 0.053 ms
 Execution Time: 0.034 ms
(4 rows)

Did I miss some steps required to make this btree index usable for pattern matching queries?

Indexes:
    "pk_document" PRIMARY KEY, btree (id)
     ....
    "rse_tmp_doc_file_name" btree (file_name varchar_pattern_ops)

I was expecting the index I created to be used for pattern matching too, as long as selectivity is good and the pattern doesn't start with a wildcard.

I have tried SET enable_seqscan=off, as suggested. The plan changed, but is still very slow:

EXPLAIN ANALYZE select id from RMX_SERVICE_SCHEMA.DOCUMENT d1_0 where d1_0.FILE_NAME LIKE 'sunet_attachments/20240207.xml';
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=31133.85..158837.55 rows=63 width=37) (actual time=300.945..314.717 rows=1 loops=1)
   Workers Planned: 4
   Workers Launched: 4
   ->  Parallel Bitmap Heap Scan on document d1_0  (cost=30133.85..157831.25 rows=16 width=37) (actual time=290.728..297.328 rows=0 loops=5)
         Filter: ((file_name)::text ~~ 'sunet_attachments/20240207.xml'::text)
         Rows Removed by Filter: 71233
         Heap Blocks: exact=19555
         ->  Bitmap Index Scan on rse_tmp_doc_file_name  (cost=0.00..30133.83 rows=355328 width=0) (actual time=149.426..149.426 rows=356167 loops=1)
               Index Cond: (((file_name)::text ~>=~ 'sunet'::text) AND ((file_name)::text ~<~ 'suneu'::text))
 Planning Time: 0.176 ms
 Execution Time: 314.747 ms
(11 rows)

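But this plan gave me the right hint. The problem is the _ after the string sunet: in a LIKE pattern, an unescaped _ is a wildcard that matches any single character, so the index condition can only use the fixed prefix sunet. That prefix isn't selective, since about 50% of the file_name values in the table start with sunet. The _ has to be escaped. With correct escaping in the SQL, the index works: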

EXPLAIN ANALYZE select id from RMX_SERVICE_SCHEMA.DOCUMENT d1_0 where d1_0.FILE_NAME LIKE 'sunet\_attachments/20240207.xml' escape '\';

                                                                             QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using rse_tmp_doc_file_name on document d1_0  (cost=0.55..8.57 rows=63 width=37) (actual time=0.014..0.015 rows=1 loops=1)
   Index Cond: (((file_name)::text ~>=~ 'sunet_attachments/20240207_10111647337'::text) AND ((file_name)::text ~<~ 'sunet_attachments/20240207'::text))
   Filter: ((file_name)::text ~~ 'sunet\_attachments/20240207.xml'::text)
 Planning Time: 0.152 ms
 Execution Time: 0.024 ms
(5 rows)
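
Since the application builds these patterns at runtime, the escaping can be done in one place before the pattern is bound to the query. The following is a minimal sketch, not the application's actual code: the function names escape_like and prefix_pattern are hypothetical, and the %s placeholder in QUERY assumes a psycopg-style driver. Backslash is PostgreSQL's default LIKE escape character, so the ESCAPE clause is only spelled out for clarity.

```python
def escape_like(literal: str, escape: str = "\\") -> str:
    """Escape LIKE metacharacters so that `literal` matches only itself."""
    if len(escape) != 1:
        raise ValueError("escape must be a single character")
    out = []
    for ch in literal:
        # In a LIKE pattern, '%', '_' and the escape character are special.
        if ch in (escape, "%", "_"):
            out.append(escape)
        out.append(ch)
    return "".join(out)


def prefix_pattern(literal: str, escape: str = "\\") -> str:
    """Build a 'starts with' pattern: escaped literal plus a trailing %."""
    return escape_like(literal, escape) + "%"


# Hypothetical query text; the pattern is passed as a bound parameter.
# The placeholder syntax (%s) is an assumption about the driver in use.
QUERY = (
    "SELECT id FROM rmx_service_schema.document "
    "WHERE file_name LIKE %s ESCAPE '\\'"
)

print(escape_like("sunet_attachments/20240207.xml"))
# prints: sunet\_attachments/20240207.xml
print(prefix_pattern("sunet_attachments/2024"))
# prints: sunet\_attachments/2024%
```

Only the part of the pattern that comes from literal file names goes through escape_like; a wildcard the application really wants, such as the trailing %, is appended afterwards.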

The suggested question is about pg_trgm indexes; my question is about btree indexes.
– Pulsedriver, Commented Sep 10 at 15:04
Can you run SET enable_seqscan = off;, then run the EXPLAIN ANALYZE with LIKE again and add the result to the question?
– Laurenz Albe, Commented Sep 10 at 15:10
@Laurenz Albe Thanks a lot, that was the right hint. I have to escape the _ in my search pattern. Now it works.
– Pulsedriver, Commented Sep 10 at 15:32
Great. Rather than adding the solution to the question, you could write an answer to your own question. I for one would be happy to upvote it.
– Laurenz Albe, Commented Sep 10 at 15:39

Richard Huxton · Accepted Answer · 2025-09-10 15:32:03Z


The trailing '%' isn't what is stopping the index from working; it is the '_' in the directory name. In a LIKE pattern, that matches any single character. You are going to have to escape underscores.

https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-LIKE


2 Comments

Laurenz Albe Sep 10 at 15:37

That underscore wildcard isn't preventing PostgreSQL from using the index. But there may be so many rows matching the prefix that PostgreSQL estimates a sequential scan to be cheaper.

Pulsedriver Sep 10 at 15:41

This is true. Unfortunately, about 50% of the values have the prefix sunet. After escaping the _, it works fine now. Thanks a lot to all!
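
The prefix effect discussed in these comments can be reproduced with a small model. This is a simplified sketch of the behavior visible in the plans above, not PostgreSQL's actual planner code: the function names like_fixed_prefix and range_bounds are hypothetical, and the upper bound is computed by simply incrementing the last character of the prefix, which ignores collation and encoding details.

```python
def like_fixed_prefix(pattern: str, escape: str = "\\") -> str:
    """Return the literal text before the first unescaped % or _."""
    prefix = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch in ("%", "_"):
            break  # first real wildcard ends the fixed prefix
        if ch == escape:
            i += 1
            if i >= len(pattern):
                break  # dangling escape; this sketch just stops here
            ch = pattern[i]  # escaped character is taken literally
        prefix.append(ch)
        i += 1
    return "".join(prefix)


def range_bounds(prefix: str) -> tuple[str, str]:
    """Half-open range [lower, upper) of strings starting with `prefix`.

    Simplification: assumes a non-empty prefix whose last character is
    not the highest possible code point.
    """
    if not prefix:
        raise ValueError("an empty prefix gives no usable index range")
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


unescaped = like_fixed_prefix("sunet_attachments/20240207.xml")
print(range_bounds(unescaped))
# prints: ('sunet', 'suneu')
print(like_fixed_prefix("sunet\\_attachments/20240207.xml"))
# prints: sunet_attachments/20240207.xml
```

The first result corresponds to the Index Cond in the bitmap scan plan (~>=~ 'sunet' AND ~<~ 'suneu'), which covers roughly half the table. With the underscore escaped, the whole file name becomes the fixed prefix, so the range is narrow enough for an index scan to win.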
