I have a query as follows:

SELECT MAX(c."Sequence") 
FROM "Cips" AS c 
WHERE c."StoreNumber" IN (1, 2) 
AND c."DataProvider" IN ('MCIP', 'SAM')

I also ran EXPLAIN ANALYZE and received the following plan:

"Aggregate  (cost=43628.91..43628.92 rows=1 width=8) (actual time=81.290..81.292 rows=1 loops=1)"
"  ->  Append  (cost=0.43..43498.29 rows=52248 width=8) (actual time=0.090..75.045 rows=61163 loops=1)"
"        ->  Index Scan using ""a_StoreNumber_DataProvider_idx"" on a c  (cost=0.43..43237.05 rows=52248 width=8) (actual time=0.089..67.541 rows=61163 loops=1)"
"              Index Cond: ((""StoreNumber"" = ANY ('{-1,1}'::integer[])) AND (""DataProvider"" = ANY ('{MCIP,SAM}'::text[])))"
"Planning Time: 0.677 ms"
"Execution Time: 81.366 ms"

I have only one index defined on this table:

CREATE INDEX "idx_Cip_StoreNumber_Sequence_DataProvider"
    ON public."Cips" USING btree
    ("StoreNumber" ASC NULLS LAST, "Sequence" ASC NULLS LAST, "DataProvider" COLLATE pg_catalog."default" ASC NULLS LAST)
    TABLESPACE pg_default;

My table has roughly 1.4 million rows. I also tried partitioning the table, but that gave only about 10% better performance. My problem is that I need to improve the performance by 90%, and I am really stuck.

I was wondering whether I can improve the performance of this query, or whether I should look in another direction, for example changing the architecture around this field and how we get the max value.

  • 81 milliseconds doesn't sound that bad. How fast do you need that to be? Commented Feb 25, 2022 at 19:38
  • Your plan says the index is called a_StoreNumber_DataProvider_idx but the CREATE INDEX says otherwise. Also, 81 ms to find a result from 1.4M rows seems fair. How did you partition the data? Commented Feb 25, 2022 at 19:50
  • I partitioned it by the DataProvider column, which has 3 values: MCIP, SAM and CIP. The problem is that I need to reduce it to 30 ms at most. Commented Feb 25, 2022 at 19:57
  • Using not exists(...) or row_count() could possibly avoid the append+aggregate. BTW: please add some DDL to your question. Commented Feb 26, 2022 at 12:15

1 Answer

Try a different index, where you have the columns ordered in the way you need them for this query:

CREATE INDEX "idx_Cip_StoreNumber_DataProvider_Sequence"
    ON public."Cips" USING btree
    ("StoreNumber" ASC NULLS LAST, "DataProvider" COLLATE pg_catalog."default" ASC NULLS LAST, "Sequence" DESC NULLS LAST)
    TABLESPACE pg_default;

"Sequence" is now the last column in the index, after the two columns used in the WHERE clause, and is sorted descending with NULLS LAST.

Does this improve the query plan? Please use EXPLAIN(ANALYZE, VERBOSE, BUFFERS) to get the complete plan.
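
For reference, a sketch of how that could be run against the query from the question. The VACUUM step is an assumption on my part, not something the question mentions: an index-only scan can only skip heap fetches for pages marked all-visible in the visibility map, so it may help to vacuum the table before measuring.

```sql
-- Assumption: refresh the visibility map and planner statistics first,
-- so that an index-only scan can avoid heap fetches.
VACUUM (ANALYZE) public."Cips";

-- The query from the question, with the requested EXPLAIN options.
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT MAX(c."Sequence")
FROM "Cips" AS c
WHERE c."StoreNumber" IN (1, 2)
  AND c."DataProvider" IN ('MCIP', 'SAM');
```

In the output, look for "Index Only Scan" on the new index and for the "Heap Fetches" count; the BUFFERS option adds how many pages were read from cache versus disk.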

3 Comments

I doubt he will get a performance benefit from it being already sorted by sequence (in theory, PostgreSQL could derive a benefit from that, but it is not clever enough to do so), but he will definitely benefit from not having the sequence intrude into the usefulness of the DataProvider equality conditions.
@jjanes: I would expect at least an index-only scan. Let's see what the new query plan shows us.
This looks great:
Aggregate  (cost=1105.55..1105.56 rows=1 width=8) (actual time=6.483..6.485 rows=1 loops=1)
  ->  Index Only Scan using "idx_Cip_StoreNumber_DataProvider_Sequence" on "Cips" c  (cost=0.43..1003.90 rows=40659 width=8) (actual time=0.083..5.051 rows=17508 loops=1)
        Index Cond: (("StoreNumber" = ANY ('{1,2}'::integer[])) AND ("DataProvider" = ANY ('{MCIP,SAM}'::text[])))
        Heap Fetches: 0
Planning Time: 0.290 ms
Execution Time: 6.533 ms
You saved my week. I think I need to invest more time in my SQL skills :D
