In the query below I'm trying to select data, but I need the results to be distinct on the columns Book.Title and OrderDate. I have tried using DISTINCT ON, but I still get multiple rows back for the same title and month; I have posted that output below as well. Sorry if the question is a bit trivial. I'm currently learning SQL, so I have probably missed something really obvious.

How could I change the query so that the results look like this:

   month   |                    title                    | quantity | total_value 
-----------+---------------------------------------------+----------+-------------
 February  | Internet and World Wide Web: How to Program |       15 |      899.70
 March     | C How To Program                            |        4 |      183.92
 March     | Core Servlets and JavaServer Pages          |       13 |      856.70
 March     | Internet and World Wide Web: How to Program |       21 |     1071.58

The query I constructed:

SELECT DISTINCT ON (OrderDate, Book.Title) Book.Title, to_char(OrderDate, 'Month') AS "Order Date", 
OrderLine.Quantity AS "Order Quantity", (SUM(Quantity*UnitSellingPrice)) AS "Total Value"
FROM Book
INNER JOIN Publisher
ON Book.PublisherID=Publisher.PublisherID
INNER JOIN OrderLine
ON Book.BookID=OrderLine.BookID
INNER JOIN ShopOrder
ON OrderLine.ShopOrderID=ShopOrder.ShopOrderID
WHERE Publisher.Name='Prentice Hall'
GROUP BY book.title, OrderDate, orderline.quantity
ORDER BY OrderDate ASC;

The results I get:

                    title                    | Order Date | Order Quantity | Total Value 
---------------------------------------------+------------+----------------+-------------
 Internet and World Wide Web: How to Program | February   |             10 |      299.90
 Internet and World Wide Web: How to Program | February   |              5 |      149.95
 C How To Program                            | March      |              3 |       68.97
 Core Servlets and JavaServer Pages          | March      |             10 |      329.50
 Internet and World Wide Web: How to Program | March      |             20 |      519.80
 C How To Program                            | March      |              1 |       22.99
 Core Servlets and JavaServer Pages          | March      |              3 |       98.85
 Internet and World Wide Web: How to Program | March      |              1 |       15.99

Thanks so much for any help!

2 Answers

I think you're over-complicating things here: if you want the total quantity for each title and month, then group by the title and the month, and sum the quantity. You don't want to group by the quantity itself, because that produces a separate row for each distinct quantity.

That's easy to do in standard SQL (DISTINCT ON is a Postgres extension, useful in a few cases instead of GROUP BY, but unnecessary here):

SELECT
    Book.Title,
    to_char(OrderDate, 'Month') AS "Order Date", 
    SUM(OrderLine.Quantity) AS "Order Quantity", 
    SUM(Quantity*UnitSellingPrice) AS "Total Value"
FROM [...]
WHERE Publisher.Name='Prentice Hall'
GROUP BY
    book.title,
    to_char(OrderDate, 'Month')
ORDER BY MIN(OrderDate) ASC;  -- chronological; ordering by the month name would sort alphabetically

The rule of thumb is that everything in your SELECT clause (and in your ORDER BY clause, which is considered to be "at the same level") should be either:

  • listed in the GROUP BY clause, because you want a new row every time it varies; or
  • wrapped in an aggregate function (here SUM()), because you want to "roll it up" according to some rule.
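
To make the rule concrete, here is a small self-contained sketch for PostgreSQL. The sales rows, column names, and prices are hypothetical and stand in for the joined Book/OrderLine/ShopOrder tables; they are not taken from the question's database.

```sql
-- Hypothetical sample rows standing in for the joined tables in the question.
WITH sales(title, order_date, quantity, unit_price) AS (
    VALUES
        ('Book A', DATE '2024-02-03', 10, 5.00),
        ('Book A', DATE '2024-02-17',  5, 5.00),
        ('Book A', DATE '2024-03-09',  2, 5.00),
        ('Book B', DATE '2024-03-11',  3, 8.00)
)
SELECT
    title,                                        -- listed in GROUP BY
    to_char(order_date, 'Month') AS order_month,  -- listed in GROUP BY
    SUM(quantity)                AS quantity,     -- rolled up per title/month
    SUM(quantity * unit_price)   AS total_value   -- rolled up per title/month
FROM sales
GROUP BY title, to_char(order_date, 'Month')
ORDER BY MIN(order_date);                         -- an aggregate, so it is allowed here
```

With these rows the two February orders for Book A collapse into a single row (quantity 15, total 75.00), while the March rows for Book A and Book B stay separate because their titles differ. Adding quantity to the GROUP BY list would split the February row back into two.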

2 Comments

Thanks so much for explaining in detail. My idea of GROUP BY has been completely wrong in my head XD. Explaining it like that helped a bunch. I know I should read the documentation in more detail, but I really learn best from practical, hands-on work. I do try not to resort to asking on here when I can, since I guess constant trivial posts (probably like mine) are pointless at times :).
No probs - GROUP BY seems to confuse a lot of people at first, but once you "get it", it has a logic to it which I find quite elegant. Sometimes it just won't do what you need to do, but once you understand why it won't, you can start approaching those problems from a different direction.

To see what is going on, consider:

SELECT DISTINCT ON (OrderDate, Book.Title)
    Book.Title,
    OrderDate as "Actual Order Date",
    to_char(OrderDate, 'Month') AS "Order Month", ..

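That is, the query in the post selects an "Order Date" column that is really the order month, not the actual OrderDate value used by DISTINCT ON and GROUP BY. Rows with different dates in the same month are therefore treated as distinct even though they display the same month, which explains both the apparent duplicates and the incorrect aggregate results.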
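
A minimal sketch of that effect for PostgreSQL, using hypothetical rows rather than the question's data: DISTINCT ON keeps one row per (order_date, title) pair, so two different dates in the same month both survive, and any extra row for the same date and title is silently dropped.

```sql
-- Hypothetical sample rows; two orders share a date, a third falls later in the same month.
WITH sales(title, order_date, quantity) AS (
    VALUES
        ('Book A', DATE '2024-02-03', 10),
        ('Book A', DATE '2024-02-03',  4),
        ('Book A', DATE '2024-02-17',  5)
)
SELECT DISTINCT ON (order_date, title)
    title,
    order_date                   AS actual_order_date,  -- what DISTINCT ON compares
    to_char(order_date, 'Month') AS order_month,        -- what the reader sees
    quantity
FROM sales
-- The leading ORDER BY expressions must match the DISTINCT ON list;
-- the remaining one decides which row of each pair is kept.
ORDER BY order_date, title, quantity DESC;
```

This returns two rows that both show February: the one with quantity 10 for 3 February and the one with quantity 5 for 17 February. The row with quantity 4 is discarded instead of being added to a total.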

I suspect the desired query looks something like the following. I have yet to find a need to combine DISTINCT with GROUP BY.

SELECT
  -- These are in the GROUP BY and will be DISTINCT in the result
  Book.Title,
  to_char(OrderDate, 'Month') AS "Order Month", 
  -- Non GROUP BY values should be aggregated;
  -- In this case the aggregates are over Title/Month
  SUM(OrderLine.Quantity) AS "Order Quantity",
  SUM(Quantity*UnitSellingPrice) AS "Total Value"
FROM Book
INNER JOIN Publisher
ON Book.PublisherID=Publisher.PublisherID
INNER JOIN OrderLine
ON Book.BookID=OrderLine.BookID
INNER JOIN ShopOrder
ON OrderLine.ShopOrderID=ShopOrder.ShopOrderID
WHERE Publisher.Name='Prentice Hall'
-- For each Title/Month
GROUP BY book.title,
         to_char(OrderDate, 'Month')       -- also grouped on Month
ORDER BY MIN(OrderDate) ASC;               -- chronological; the month name alone would sort alphabetically
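
One caveat when grouping on a month name: to_char(..., 'Month') returns text, so sorting on it is alphabetical, and the same month in different years lands in one group. If that matters for your data, a possible variation is to group on date_trunc('month', ...) and format the label only for display. The sketch below assumes PostgreSQL and uses hypothetical rows and column names.

```sql
-- Hypothetical sample rows spanning two years.
WITH sales(title, order_date, quantity) AS (
    VALUES
        ('Book A', DATE '2023-04-20',  1),
        ('Book A', DATE '2024-02-03', 10),
        ('Book A', DATE '2024-04-05',  2)
)
SELECT
    title,
    -- Display label built from the grouping expression; FM strips the blank padding.
    to_char(date_trunc('month', order_date), 'FMMonth YYYY') AS order_month,
    SUM(quantity) AS quantity
FROM sales
GROUP BY title, date_trunc('month', order_date)    -- one group per calendar month and year
ORDER BY date_trunc('month', order_date), title;   -- chronological, not alphabetical
```

Here April 2023 and April 2024 stay in separate rows, and February 2024 sorts between them.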
