1

I have simple table of Time and Task columns where it is ordered by Time and has associated task to it as per below example. Question: How to group the tasks and get the earliest time of every time the task is changing?

SELECT 
    "Time",
    "Task"
FROM my_query
ORDER BY
    "Time"
Time Task
06:06:41 Repack
06:06:44 Vehicle Load
06:06:45 Vehicle Load
06:06:46 Vehicle Load
06:10:47 Repack
06:10:48 Vehicle Load
06:10:51 Vehicle Load
06:10:52 Vehicle Load
06:15:45 Repack
06:16:13 Repack
06:24:30 Repack
06:24:34 Vehicle Load
06:26:11 Repack
06:26:12 Vehicle Load
06:26:13 Vehicle Load
06:29:33 Relocate
06:46:07 Repack
06:46:45 Repack
07:01:54 Pick
07:04:12 Pick
07:04:26 Marshal
07:06:28 Pick
07:17:06 Repack
07:17:10 Vehicle Load
07:17:11 Vehicle Load
07:21:55 Repack

The output is required:

Time Task
06:06:41 Repack
06:06:44 Vehicle Load
06:10:47 Repack
06:10:48 Vehicle Load
06:15:45 Repack

and so on...

I have tried DENSE_RANK(), ROW_NUMBER() functions without any luck.. Please help

2 Answers 2

1

You can use the LAG analytic function to get the previous Task and then find when that changes:

SELECT Time,
       Task
FROM   (
  SELECT Time,
         Task,
         LAG(task) OVER (ORDER BY time) AS prev_task
  FROM my_query
)
WHERE  task != prev_task
OR     prev_task IS NULL

Or, from Oracle 12, you can use MATCH_RECOGINZE to do row-by-row pattern matching:

SELECT time, Task
FROM   my_query
       MATCH_RECOGNIZE(
         ORDER BY time
         ALL ROWS PER MATCH
         PATTERN (not_same)
         DEFINE not_same AS task != PREV(task) OR PREV(task) IS NULL
       )

Or, if you did want to use ROW_NUMBER then you can use:

SELECT time,
       Task
FROM   (
  SELECT time,
         Task,
         ROW_NUMBER() OVER (PARTITION BY task, grp ORDER BY time) AS rn
  FROM   (
    SELECT Time,
           Task,
           ROW_NUMBER() OVER (ORDER BY time)
             - ROW_NUMBER() OVER (PARTITION BY task ORDER BY time) AS grp
    FROM   my_query
  )
)
WHERE rn = 1
ORDER BY time

(But the first two are probably going to be more efficient.)

Which, for the sample data:

CREATE TABLE my_query (Time, Task) AS
SELECT INTERVAL '06:06:41' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
SELECT INTERVAL '06:06:44' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '06:06:45' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '06:06:46' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '06:10:47' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
SELECT INTERVAL '06:10:48' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '06:10:51' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '06:10:52' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '06:15:45' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
SELECT INTERVAL '06:16:13' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
SELECT INTERVAL '06:24:30' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
SELECT INTERVAL '06:24:34' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '06:26:11' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
SELECT INTERVAL '06:26:12' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '06:26:13' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '06:29:33' HOUR TO SECOND, 'Relocate'     FROM DUAL UNION ALL
SELECT INTERVAL '06:46:07' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
SELECT INTERVAL '06:46:45' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
SELECT INTERVAL '07:01:54' HOUR TO SECOND, 'Pick'         FROM DUAL UNION ALL
SELECT INTERVAL '07:04:12' HOUR TO SECOND, 'Pick'         FROM DUAL UNION ALL
SELECT INTERVAL '07:04:26' HOUR TO SECOND, 'Marshal'      FROM DUAL UNION ALL
SELECT INTERVAL '07:06:28' HOUR TO SECOND, 'Pick'         FROM DUAL UNION ALL
SELECT INTERVAL '07:17:06' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
SELECT INTERVAL '07:17:10' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '07:17:11' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
SELECT INTERVAL '07:21:55' HOUR TO SECOND, 'Repack'       FROM DUAL;

All output:

TIME TASK
+00 06:06:41.000000 Repack
+00 06:06:44.000000 Vehicle Load
+00 06:10:47.000000 Repack
+00 06:10:48.000000 Vehicle Load
+00 06:15:45.000000 Repack
+00 06:24:34.000000 Vehicle Load
+00 06:26:11.000000 Repack
+00 06:26:12.000000 Vehicle Load
+00 06:29:33.000000 Relocate
+00 06:46:07.000000 Repack
+00 07:01:54.000000 Pick
+00 07:04:26.000000 Marshal
+00 07:06:28.000000 Pick
+00 07:17:06.000000 Repack
+00 07:17:10.000000 Vehicle Load
+00 07:21:55.000000 Repack

fiddle

Sign up to request clarification or add additional context in comments.

Comments

0

The following should do the trick (in any DBMS):

WITH ranked_tasks AS (
  SELECT 
    "Time",
    "Task",
    LAG("Task") OVER (ORDER BY "Time") AS prev_task
  FROM my_query
)
SELECT 
  "Time",
  "Task"
FROM ranked_tasks
WHERE "Task" != prev_task OR prev_task IS NULL
ORDER BY "Time"

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.