0

I'm having trouble with a "sub select" (a subquery inside another query):

select
    f.timestamp::date as date,
    user_id,
    activity_type,
    f.container_id as group_id,
    (
        select
            string_agg(distinct("userId"), ',') as group_owners
        from
            jusers_groups_copy g
        where
            g.place_id = f.container_id
            and state like 'owner'
    ) as group_owners
from
    fact_activity f
where
    f.container_type like '700'
    and f.timestamp::date < to_date('2016-09-05', 'YYYY-MM-DD')
group by
    date, user_id, activity_type, group_id
order by
    date, user_id, activity_type, group_id

Indeed, the string_agg inside takes about 20 seconds to return. I used pgAdmin to EXPLAIN the query, and it gives me this output:

"Group  (cost=7029.62..651968.20 rows=17843 width=27) (actual time=431.017..4513.973 rows=11483 loops=1)"
"  Buffers: shared hit=139498 read=411, temp read=255 written=255"
"  ->  Sort  (cost=7029.62..7074.90 rows=18111 width=27) (actual time=430.630..667.098 rows=54660 loops=1)"
"        Sort Key: ((f."timestamp")::date), f.user_id, f.activity_type, f.container_id"
"        Sort Method: external merge  Disk: 2008kB"
"        Buffers: shared hit=1702 read=411, temp read=255 written=255"
"        ->  Seq Scan on fact_activity f  (cost=0.00..5748.76 rows=18111 width=27) (actual time=0.107..188.827 rows=54660 loops=1)"
"              Filter: ((container_type ~~ '700'::text) AND (("timestamp")::date < to_date('2016-09-05'::text, 'YYYY-MM-DD'::text)))"
"              Rows Removed by Filter: 125414"
"              Buffers: shared hit=1691 read=411"
"  SubPlan 1"
"    ->  Aggregate  (cost=36.12..36.13 rows=1 width=5) (actual time=0.315..0.318 rows=1 loops=11483)"
"          Buffers: shared hit=137796"
"          ->  Seq Scan on users_groups_copy g  (cost=0.00..36.09 rows=11 width=5) (actual time=0.041..0.266 rows=13 loops=11483)"
"                Filter: ((state ~~ 'owner'::text) AND (place_id = f.container_id))"
"                Rows Removed by Filter: 1593"
"                Buffers: shared hit=137796"
"Total runtime: 4536.074 ms"

Moreover, I tried joining the tables instead, but the query is much slower that way:

select
    f.timestamp::date as date,
    user_id,
    activity_type,
    f.container_id as group_id,
    string_agg(distinct("userId"), ',') as group_owners
from
    fact_activity f
    join jusers_groups_copy g
        on g.place_id = f.container_id
where
    f.container_type like '700'
    and f.timestamp::date < to_date('2016-09-05', 'YYYY-MM-DD')
    and g.state like 'owner'
group by
    date, user_id, activity_type, group_id
order by
    date, user_id, activity_type, group_id

Finally, there aren't any indexes in this database; is that why the query is so slow?

I'd like to know how to improve this query.

Thanks in advance

  • there "is" or "are not" any indexes? Commented Sep 5, 2016 at 14:57
  • This is not the main reason, but there is no reason to use "like" without wildcards: f.container_type like '700' Commented Sep 5, 2016 at 14:59
  • Try an index on fact_activity(container_type, timestamp), use container_type = instead of container_type like, and f.timestamp < to_timestamp('2016-09-05', 'YYYY-MM-DD') (sketched below) Commented Sep 5, 2016 at 15:01
  • I think you want indexes on both of these too, right: g.place_id, f.container_id? Commented Sep 5, 2016 at 15:02
  • Type this into Google: "how to create index in postgres" Commented Sep 5, 2016 at 15:25
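A minimal sketch of the index and predicate changes suggested in those comments (the index name is hypothetical, and whether this particular index helps depends on the data):

-- hypothetical name; covers the two columns filtered on in fact_activity
CREATE INDEX fact_activity_type_ts_idx
    ON fact_activity (container_type, "timestamp");

-- sargable predicates: equality instead of LIKE and no per-row ::date cast,
-- so the planner can actually consider the index above
-- where f.container_type = '700'
--   and f.timestamp < to_timestamp('2016-09-05', 'YYYY-MM-DD')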

2 Answers

1

Without changing the query, the biggest performance improvement would come from an index that speeds up the subselect:

CREATE INDEX nice_name ON jusers_groups_copy(place_id, state text_pattern_ops);

But I would rewrite the query as a join. That way you might get something more efficient than a nested loop, depending on your data.

Instead of

SELECT f.somecol,
   (SELECT g.othercol
    FROM jusers_groups_copy g
    WHERE g.place_id = f.container_id
      AND g.state LIKE 'owner')
FROM fact_activity f
WHERE ...;

you should write

SELECT f.somecol, g.othercol
FROM fact_activity f
   JOIN jusers_groups_copy g
      ON g.place_id = f.container_id
WHERE g.state LIKE 'owner'
  AND ...;

Depending on the join type selected, the index above (for a nested loop) or a different index can make that query fast.
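Applied to the query in the question, the join version might look something like this (only a sketch; table and column names are taken from the question, the predicates follow the comment suggestions, and the left join keeps the original behaviour of returning NULL group_owners for groups without owners):

select
    f.timestamp::date as date,
    f.user_id,
    f.activity_type,
    f.container_id as group_id,
    string_agg(distinct g."userId", ',') as group_owners
from
    fact_activity f
    left join jusers_groups_copy g
        on g.place_id = f.container_id
        and g.state = 'owner'
where
    f.container_type = '700'
    and f.timestamp < to_timestamp('2016-09-05', 'YYYY-MM-DD')
group by
    date, f.user_id, f.activity_type, group_id
order by
    date, f.user_id, f.activity_type, group_id;

Switch to a plain join if you only care about groups that actually have owners.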


0

I guess you need to change some configuration in /data/postgresql.conf; use the following website:

pgtune

I think the most important parameter is "work_mem".
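For example (the value below is only illustrative; the plan in the question spilled about 2 MB to disk for its sort, so even a modest increase would keep that sort in memory, and note that work_mem is allocated per sort/hash operation and per connection):

-- try it for the current session first
SET work_mem = '16MB';

-- or set it for everyone in postgresql.conf and reload:
--   work_mem = 16MB
-- then run: select pg_reload_conf();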

