1
\$\begingroup\$

The challenge is as follows: "Julia conducted a 15 days of learning SQL contest. The start date of the contest was March 01, 2016 and the end date was March 15, 2016.

Write a query to print total number of unique hackers who made at least one submission each day (starting on the first day of the contest), and find the hacker_id and name of the hacker who made maximum number of submissions each day. If more than one such hacker has a maximum number of submissions, print the lowest hacker_id. The query should print this information for each day of the contest, sorted by the date." Details are here.

My solution was accepted, and I got a full score. However, my solution has 7 queries and 1 while loop. Since I mostly use python and C++, this is a natural way for me, and I find it more readable than just one very big query. Is it ok in this instance? Any other comment is welcome, too.

DECLARE @TODAY DATETIME = '2016-03-01', @STOP_DATE DATETIME = '2016-03-15', @TOMORROW DATETIME
SELECT * INTO #TEMP_CONSECUTIVE FROM (
    SELECT submission_date, hacker_id AS h_id, COUNT(*) AS COUNT_ATTEMPTS FROM Submissions WHERE submission_date = @TODAY     GROUP BY submission_date, hacker_id
)
AS X

WHILE (@TODAY < @STOP_DATE) 
BEGIN
   SET @TOMORROW = DATEADD(DAY, 1, @TODAY)
   INSERT INTO #TEMP_CONSECUTIVE (submission_date, h_id, COUNT_ATTEMPTS) SELECT submission_date, hacker_id, COUNT(*) FROM Submissions WHERE submission_date = @TOMORROW AND EXISTS(SELECT * FROM #TEMP_CONSECUTIVE WHERE submission_date=@TODAY AND h_id=hacker_id) GROUP BY submission_date, hacker_id
   SET @TODAY = @TOMORROW
END

SELECT * INTO #TEMP_ATTEMPTS_COUNT FROM (SELECT submission_date, hacker_id AS h_id, COUNT(*) AS COUNT_ATTEMPTS FROM Submissions --WHERE submission_date = @TOMORROW 
    GROUP BY submission_date, hacker_id) AS U

SELECT submission_date, MAX_COUNT_ATTEMPTS INTO #TEMP_MAX FROM (
  SELECT submission_date, MAX(COUNT_ATTEMPTS) AS MAX_COUNT_ATTEMPTS FROM #TEMP_ATTEMPTS_COUNT GROUP BY submission_date 
) AS Y

SELECT submission_date, COUNT_MAX_COUNT_ATTEMPTS, HACKER_ID, HACKER_NAME INTO #TEMP_RESULT FROM (
   SELECT TM.submission_date, COUNT(DISTINCT TC.h_id) AS COUNT_MAX_COUNT_ATTEMPTS, MIN(TAC.h_id) AS HACKER_ID, CAST('' AS NVARCHAR(MAX)) AS HACKER_NAME 
  FROM #TEMP_MAX TM INNER JOIN #TEMP_ATTEMPTS_COUNT TAC ON (TM.submission_date=TAC.submission_date AND TM.MAX_COUNT_ATTEMPTS=TAC.COUNT_ATTEMPTS) 
  INNER JOIN #TEMP_CONSECUTIVE TC ON (TM.submission_date=TC.submission_date) 
  GROUP BY TAC.submission_date, TAC.COUNT_ATTEMPTS, TM.submission_date, TM.MAX_COUNT_ATTEMPTS
) AS Z

UPDATE TR SET TR.HACKER_NAME=H.NAME FROM HACKERS H INNER JOIN  #TEMP_RESULT TR ON (TR.HACKER_ID = H.HACKER_ID)

SELECT * FROM #TEMP_RESULT  ORDER BY submission_date
\$\endgroup\$

1 Answer 1

2
\$\begingroup\$

No; this doesn't look like a good way to use SQL.

Use of WHILE loops, temp tables, etc are sometimes helpful for performance reasons, but they're very rarely needed to express yourself in SQL, and used naively they will generally make performance worse.

Idiomatic SQL is "declarative", meaning you say what the result of the computation should be, and as much as practical you let the engine worry about how to compute it. In this case, you've been asked to do two things (list who submitted every day, and list who submitted the most each day), so I would expect to see just two "top level" queries. Different SQL variants will give you different options for nesting or breaking out sub-queries, but it would look roughly like

DECLARE @START_DATE DATETIME = '2016-03-01'
DECLARE @STOP_DATE DATETIME = '2016-03-15'

SELECT h.hacker_id                                   -- 1) Tell me IDs
  FROM (                                             -- 4) A daily submission is
    SELECT s.hacker_id                               -- 5) just the ID
      FROM Submissions AS s                          -- 6) of a submission
      GROUP BY s.hacker_id, DATE(s.submission_date)  -- 7) de-duplicated by date.
  ) AS h
  GROUP BY h.hacker_id                               -- 2) de-duplicated
  HAVING DAYS(@START_DATE, @END_DATE) = COUNT(*)     -- 3) with daily submissions.

WITH scores AS (                                     -- 01) everyone's daily scores are
  SELECT s.hacker_id                                 -- 02) their id,
        ,DATE(s.submission_date) AS date             -- 03) the date,
        ,COUNT(*) AS score                           -- 04) and their submission count
    FROM Submissions AS s
    GROUP BY date, s.hacker_id                       -- 05) for every hacker every day.
)
SELECT s.date, MIN(s.hacker_id)                      -- 14) and print the lowest ID.
  FROM scores AS s                                   -- 11) get the daily scores
  INNER JOIN (                                       -- 06) The high scores are
    SELECT hs.date, MAX(hs.score) AS score           -- 07) the date and highest score
      FROM scores AS hs
      GROUP BY hs.date                               -- 08) each day.
  ) AS high_scores
    ON s.date = high_scores.date                     -- 12) for that day
    AND s.score = high_scores.score                  -- 13) that are the high score,
  GROUP BY s.date                                    -- 09) For each day
  ORDER BY s.date                                    -- 10) in order
\$\endgroup\$

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.