A table raw_data has an index ix_raw_data_timestamp:

CREATE TABLE IF NOT EXISTS public.raw_data
(
    ts timestamp without time zone NOT NULL,
    log_msg character varying COLLATE pg_catalog."default",
    log_image bytea
);
CREATE INDEX IF NOT EXISTS ix_raw_data_timestamp
    ON public.raw_data USING btree
    (ts ASC NULLS LAST)
    TABLESPACE pg_default;

For some reason the index is not used for the following query, which is therefore very slow:

SELECT ts,
    log_msg
FROM raw_data
ORDER BY ts ASC
LIMIT 5e6;

The result of EXPLAIN (analyze, buffers, format text) for the query above:

                                                                       QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=9752787.07..10336161.14 rows=5000000 width=50) (actual time=789124.600..859046.614 rows=5000000 loops=1)
   Buffers: shared hit=12234 read=888521, temp read=2039471 written=2664654
   ->  Gather Merge  (cost=9752787.07..18421031.89 rows=74294054 width=50) (actual time=789085.442..822547.099 rows=5000000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=12234 read=888521, temp read=2039471 written=2664654
         ->  Sort  (cost=9751787.05..9844654.62 rows=37147027 width=50) (actual time=788203.880..795491.054 rows=1667070 loops=3)
               Sort Key: "ts"
               Sort Method: external merge  Disk: 1758904kB
               Worker 0:  Sort Method: external merge  Disk: 1762872kB
               Worker 1:  Sort Method: external merge  Disk: 1756216kB
               Buffers: shared hit=12234 read=888521, temp read=2039471 written=2664654
               ->  Parallel Seq Scan on raw_data  (cost=0.00..1272131.27 rows=37147027 width=50) (actual time=25.436..119352.861 rows=29717641 loops=3)
                     Buffers: shared hit=12141 read=888520
 Planning Time: 5.240 ms
 JIT:
   Functions: 7
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.578 ms, Inlining 76.678 ms, Optimization 24.578 ms, Emission 13.060 ms, Total 114.894 ms
 Execution Time: 877489.531 ms
(20 rows)

But it is used for this one:

SELECT ts,
    log_msg
FROM raw_data
ORDER BY ts ASC
LIMIT 4e6;

EXPLAIN (analyze, buffers, format text) of the query above is:

                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..9408157.15 rows=4000000 width=50) (actual time=15.081..44747.127 rows=4000000 loops=1)
   Buffers: shared hit=24775 read=61155
   ->  Index Scan using ix_raw_data_timestamp on raw_data  (cost=0.57..209691026.73 rows=89152864 width=50) (actual time=2.218..16077.755 rows=4000000 loops=1)
         Buffers: shared hit=24775 read=61155
 Planning Time: 1.306 ms
 JIT:
   Functions: 3
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.406 ms, Inlining 1.121 ms, Optimization 7.917 ms, Emission 3.721 ms, Total 13.165 ms
 Execution Time: 59028.951 ms
(10 rows)

Needless to say, the aim is to get all queries to use the index regardless of the LIMIT size, but I cannot seem to find a solution.

PS:

  • There are about 89152922 rows in the table.
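
To see what the index plan would cost for the 5e6 case (and whether it is actually faster), one diagnostic suggested in the comments is to disable sort-based plans for the session and re-run the query. This is a sketch for comparison only, not a permanent fix:

```sql
-- Diagnostic only: forbid sort-based plans for this transaction so the
-- planner has to fall back to the index scan, then compare the timings
-- against the Gather Merge plan above.
BEGIN;
SET LOCAL enable_sort = off;
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT ts, log_msg
FROM raw_data
ORDER BY ts ASC
LIMIT 5e6;
ROLLBACK;
```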

Edit:

After increasing work_mem to 2 GB (SET work_mem = '2GB';), the query is a little faster (it no longer spills to disk), but still nowhere near as fast as the index scan:

                                                                       QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=5592250.54..6175624.61 rows=5000000 width=50) (actual time=215887.445..282393.743 rows=5000000 loops=1)
   Buffers: shared hit=12224 read=888531
   ->  Gather Merge  (cost=5592250.54..14260731.75 rows=74296080 width=50) (actual time=215874.072..247030.062 rows=5000000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=12224 read=888531
         ->  Sort  (cost=5591250.52..5684120.62 rows=37148040 width=50) (actual time=215854.323..221828.921 rows=1667147 loops=3)
               Sort Key: "ts"
               Sort Method: top-N heapsort  Memory: 924472kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 924379kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 924281kB
               Buffers: shared hit=12224 read=888531
               ->  Parallel Seq Scan on raw_data  (cost=0.00..1272141.40 rows=37148040 width=50) (actual time=25.899..107034.903 rows=29717641 loops=3)
                     Buffers: shared hit=12130 read=888531
 Planning Time: 0.058 ms
 JIT:
   Functions: 7
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.642 ms, Inlining 53.860 ms, Optimization 23.848 ms, Emission 11.768 ms, Total 90.119 ms
 Execution Time: 300281.654 ms
(20 rows)

  • Please edit your question and add the execution plan generated using explain (analyze, buffers, format text) (not just a "simple" explain) as formatted text, and make sure you preserve the indentation of the plan. Paste the text, then put ``` on the line before the plan and on a line after it. Commented Nov 24, 2021 at 19:19
  • @jjanes - I'm trying to have it use the index so it is faster, which is the case for the 4e6 query. Commented Nov 24, 2021 at 19:30
  • You can force it to use the index by doing set enable_sort = off. But that is not a good permanent solution, so show the explain (analyze, buffers, format text) for the query run both ways, so we can advise on how to make it make better decisions on its own. Turn track_io_timing on first, if it is not already on. Commented Nov 24, 2021 at 19:34
  • Have you tried including log_msg as a non-key index column? This could help if the messages are small/short enough, but could also slow things down otherwise. Commented Nov 24, 2021 at 19:40
  • Why do you think that using the index is always faster? Without the results from the explain, it is just an assumption. Commented Nov 24, 2021 at 19:40
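
The covering-index suggestion from the comments could look like the following sketch (PostgreSQL 11+ INCLUDE syntax; the index name is hypothetical, and whether it helps depends on log_msg values being short enough to fit in index tuples):

```sql
-- Hypothetical covering index: ts stays the only key column, log_msg is
-- stored in the index as a non-key payload so the query can be answered
-- by an index-only scan (assuming the table is reasonably well vacuumed).
CREATE INDEX IF NOT EXISTS ix_raw_data_timestamp_covering
    ON public.raw_data USING btree
    (ts ASC NULLS LAST)
    INCLUDE (log_msg);
```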

1 Answer

The problem here is that the plan uses a Parallel Seq Scan followed by a Gather Merge. The Gather Merge is estimated to take in 74,294,054 rows to output 5,000,000, which makes sense: you say there are 89,152,922 rows in the table, and the query has no WHERE clause to limit them.

Why would it choose this plan? Probably because the sort exceeds work_mem and is forced to spill to disk. So increase work_mem: if PostgreSQL thinks the sort can fit in memory and does not have to go to disk, it will be massively faster.
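
As a sketch, the setting can be raised for a single session before running the query. Note that with parallel workers each worker gets its own work_mem budget, so size it for the per-worker sort rather than the total:

```sql
-- Raise the sort memory budget for this session only; the server-wide
-- default stays untouched.  Each parallel worker may use up to work_mem
-- on its own, so the effective budget is roughly work_mem * (workers + 1).
SET work_mem = '2GB';

SELECT ts, log_msg
FROM raw_data
ORDER BY ts ASC
LIMIT 5e6;

RESET work_mem;
```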

3 Comments

  • Then does work_mem need to be set to something greater than the external merge Disk: 1758904kB to get the best results?
  • It may have to be set even higher than that. I'm not sure whether that number is the spill space for the query (i.e., after it has exhausted work_mem) or the total space including work_mem. But you can crank work_mem up to an obscene number to figure out whether that fixes it, and tune it down from there.
  • I've updated the question to include the new setting for work_mem. However, it is still doing a Gather -> Sort, and not an Index Scan. Is there another setting?
