Merge continuous rows with Postgresql

Question

I have a slots table like this :

   Column   |            Type             |
------------+-----------------------------+
 id         | integer                     |
 begin_at   | timestamp without time zone |
 end_at     | timestamp without time zone |
 user_id    | integer                     |

and I like to select merged rows for continuous time. Let's say I have (simplified) data like :

(1, 5:15, 5:30, 1)
(2, 5:15, 5:30, 2)
(3, 5:30, 5:45, 2)
(4, 5:45, 6:00, 2)
(5, 8:15, 8:30, 2)
(6, 8:30, 8:45, 2)

I would like to know if it's possible to select rows formatted like :

(5:15, 5:30, 1)
(5:15, 6:00, 2) // <======= rows id 2,3 and 4 merged
(8:15, 8:45, 2) // <======= rows id 5 and 6 merged

EDIT: Here's the SQLfiddle

I'm using Postgresql, version 9.3!

Thank you!

Similar question for SQL Server stackoverflow.com/questions/7520556/… — Ram
– Ram, Commented Oct 12, 2014 at 21:04
Use a recursive query, start at front-unjoinable begin-stubs. Extend by joining following segment. Final select only the longest paths. — wildplasser
– wildplasser, Commented Oct 12, 2014 at 21:10

Gordon Linoff · Accepted Answer · 2014-10-12 21:09:22Z

15

Here is one method for solving this problem. Create a flag that determines if a one record does not overlap with the previous one. This is the start of a group. Then take the cumulative sum of this flag and use that for grouping:

select user_id, min(begin_at) as begin_at, max(end_at) as end_at
from (select s.*, sum(startflag) over (partition by user_id order by begin_at) as grp
      from (select s.*,
                   (case when lag(end_at) over (partition by user_id order by begin_at) >= begin_at
                         then 0 else 1
                    end) as startflag
            from slots s
           ) s
     ) s
group by user_id, grp;

Here is a SQL Fiddle.

answered Oct 12, 2014 at 21:09

Gordon Linoff

1.3m62 gold badges706 silver badges857 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

vyegorov · Accepted Answer · 2014-10-12 21:40:17Z

Gordon Linoff already provided the answer (I upvoted).

I've used the same approach, but wanted to deal with tsrange type. So I came up with this construct:

SELECT min(id) b_id, min(begin_at) b_at, max(end_at) e_at, grp, user_id
  FROM (
    SELECT t.*, sum(g) OVER (ORDER BY id) grp
      FROM (
        SELECT s.*, (NOT r -|- lag(r,1,r)
                     OVER (PARTITION BY user_id ORDER BY id))::int g
          FROM (SELECT id,begin_at,end_at,user_id,
                       tsrange(begin_at,end_at,'[)') r FROM slots) s
      ) t
  ) u
 GROUP BY grp, user_id
 ORDER BY grp;

Unfortunately, on the top level one has to use min(begin_at) and max(end_at), as there're no aggregate functions for the range-based union operator +.

I create ranges with exclusive upper bounds, this allows me to use “is adjacent to” (-|-) operator. I compare current tsrange with the one on the previous row, defaulting to the current one in case there's no previous. Then I negate the comparison and cast to integer, which gives me 1 in cases when new group starts.

Collectives™ on Stack Overflow

Merge continuous rows with Postgresql

2 Answers 2

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related