How to get Postgres to return 0 for empty rows

Question

I have a query which gets data summarised between two dates like so:

SELECT date(created_at),
       COUNT(COALESCE(id, 0))            AS total_orders,
       SUM(COALESCE(total_price, 0))     AS total_price,
       SUM(COALESCE(taxes, 0))           AS taxes,
       SUM(COALESCE(shipping, 0))        AS shipping,
       AVG(COALESCE(total_price, 0))     AS average_order_value,
       SUM(COALESCE(total_discount, 0))  AS total_discount,
       SUM(total_price - COALESCE(taxes, 0) - COALESCE(shipping, 0) - COALESCE(total_discount, 0)) AS net_sales
FROM orders
WHERE shop_id = 43
  AND orders.active = true
  AND orders.created_at >= '2022-07-20'
  AND orders.created_at <= '2022-07-26'
GROUP BY date(created_at)
ORDER BY created_at::date DESC

However for dates that do not have any orders, the query returns nothing and I'd like to return 0.

I have tried with COALESCE but that doesn't seem to do the trick?

Any suggestions?

What date value do you want to return in the case of a 0 count?
– Tim Biegeleisen, Commented Jul 25, 2022 at 7:01
Use generate_series and left join to the resultant list of dates.
– Jonathan Willcock, Commented Jul 25, 2022 at 7:10
You can use a simple nested query and count the resulting rows; if it's 0, pick the default values in the outer query.
– visrey, Commented Jul 25, 2022 at 7:10
@TimBiegeleisen 0 would be ideal. I'm making a graph with the result, so I can't have missing dates.
– WagnerMatosUK, Commented Jul 25, 2022 at 7:47
This sounds like something you should handle in the presentation layer which is calling Postgres. Just check for an empty result set and, in that case, render a different graph.
– Tim Biegeleisen, Commented Jul 25, 2022 at 7:48

Erwin Brandstetter · Accepted Answer · 2022-07-26 12:07:23Z

This should be substantially faster - and correct:

SELECT *
     , total_price - taxes - shipping - total_discount AS net_sales  -- ⑤
FROM  (
   SELECT created_at
        , COALESCE(total_orders        , 0) AS total_orders
        , COALESCE(total_price         , 0) AS total_price
        , COALESCE(taxes               , 0) AS taxes
        , COALESCE(shipping            , 0) AS shipping
        , COALESCE(average_order_value , 0) AS average_order_value
        , COALESCE(total_discount      , 0) AS total_discount
   FROM   generate_series(timestamp '2022-07-20'  -- ①
                        , timestamp '2022-07-26'
                        , interval '1 day') AS g(created_at)
   LEFT  JOIN (  -- ③
      SELECT created_at::date
           , count(*)            AS total_orders  -- ⑥
           , sum(total_price)    AS total_price
           , sum(taxes)          AS taxes
           , sum(shipping)       AS shipping
           , avg(total_price)    AS average_order_value
           , sum(total_discount) AS total_discount
      FROM   orders
      WHERE  shop_id = 43
      AND    active  -- simpler
      AND    created_at >= '2022-07-20'
      AND    created_at <  '2022-07-27'  -- ② !
      GROUP  BY 1
      ) o USING (created_at)  -- ④
   ) sub
ORDER  BY created_at DESC;

db<>fiddle here

I copied, simplified, and extended Xu's fiddle for comparison.

① Why this particular form for generate_series()? See:

Generating time series between two dates in PostgreSQL
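As a quick way to see what the typed form buys you, the sketch below asks pg_typeof() for the element type of the series. With explicit timestamp bounds the series stays timestamp (without time zone). The second query passes date bounds only so you can compare which variant your server resolves to. The alias g and the shortened date range are arbitrary choices for the illustration.

```sql
-- Explicit timestamp bounds: the series is of type timestamp.
SELECT g, pg_typeof(g) AS series_type  -- timestamp without time zone
FROM   generate_series(timestamp '2022-07-20'
                     , timestamp '2022-07-22'
                     , interval '1 day') AS g;

-- Same bounds passed as date: check the reported type on your server.
SELECT g, pg_typeof(g) AS series_type
FROM   generate_series(date '2022-07-20'
                     , date '2022-07-22'
                     , interval '1 day') AS g;
```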

② Assuming created_at is of data type timestamp, your original formulation is most probably incorrect. created_at <= '2022-07-26' would include the first instant of '2022-07-26' and exclude the rest of that day. To include all of '2022-07-26', use created_at < '2022-07-27'. See:

How do I write a function in plpgsql that compares a date with a timestamp without time zone?
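A minimal sketch of the boundary problem, assuming created_at is timestamp as above. The two sample values and the column names le_26 and lt_27 are made up for the illustration; no table is needed.

```sql
SELECT ts
     , ts <= '2022-07-26' AS le_26  -- original upper bound
     , ts <  '2022-07-27' AS lt_27  -- suggested upper bound
FROM  (VALUES
         (timestamp '2022-07-26 00:00:00')  -- le_26 = true , lt_27 = true
       , (timestamp '2022-07-26 15:30:00')  -- le_26 = false, lt_27 = true
      ) v(ts);
```

The untyped literal '2022-07-26' is read as timestamp '2022-07-26 00:00:00', so an order placed in the afternoon of the 26th fails the <= test.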

③ The LEFT JOIN is the core feature of this answer. Generate all days with generate_series(), independently aggregate days from table orders, then LEFT JOIN to retain one row per day like you requested.
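Stripped down to a single aggregate, the pattern could look like the sketch below. The VALUES list stands in for table orders and holds three made-up rows; the column name day is also just for the illustration.

```sql
SELECT g.day::date                 AS day
     , COALESCE(o.total_orders, 0) AS total_orders
FROM   generate_series(timestamp '2022-07-20'
                     , timestamp '2022-07-22'
                     , interval '1 day') AS g(day)
LEFT   JOIN (
   SELECT created_at::date AS day
        , count(*)         AS total_orders
   FROM  (VALUES (timestamp '2022-07-20 09:15:00')
               , (timestamp '2022-07-20 17:40:00')
               , (timestamp '2022-07-22 11:05:00')
         ) AS orders(created_at)   -- stand-in for the real table
   GROUP  BY 1
   ) o ON o.day = g.day
ORDER  BY 1;
-- 2022-07-20 | 2
-- 2022-07-21 | 0   <- no orders, kept by the LEFT JOIN, NULL replaced by 0
-- 2022-07-22 | 1
```

Aggregating before the join keeps the join small: one row per day on each side.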

④ I made the column name match created_at, so we can conveniently shorten the join syntax with the USING clause.
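To see what USING does to the result columns, here is a small sketch with two made-up row sets in place of the generated days (g) and the aggregated orders (o):

```sql
SELECT *
FROM  (VALUES (date '2022-07-20')
            , (date '2022-07-21')) AS g(created_at)
LEFT   JOIN (VALUES (date '2022-07-20', 2)) AS o(created_at, total_orders)
       USING (created_at);
-- created_at | total_orders
-- 2022-07-20 |            2
-- 2022-07-21 |       (null)
```

With USING, SELECT * lists the join column once. With ON g.created_at = o.created_at it would appear twice.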

⑤ Compute net_sales in an outer SELECT after replacing NULL values, so we need COALESCE() only once.
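The reason the order of operations matters: arithmetic with NULL yields NULL, so a subtraction is only safe once every operand has been replaced. A minimal sketch with one made-up row, where taxes is NULL (the names raw_net and safe_net are only for the illustration):

```sql
SELECT total_price - taxes                           AS raw_net   -- NULL
     , COALESCE(total_price, 0) - COALESCE(taxes, 0) AS safe_net  -- 100.00
FROM  (VALUES (100.00, NULL::numeric)) AS v(total_price, taxes);
```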

⑥ count(*) is equivalent to COUNT(COALESCE(id, 0)) in any case, but cheaper.
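A small sketch of the equivalence, using three made-up id values, one of them NULL. count(id) is included only for contrast, since it skips NULL values:

```sql
SELECT count(*)               AS count_star      -- 3
     , count(COALESCE(id, 0)) AS count_coalesce  -- 3
     , count(id)              AS count_id        -- 2
FROM  (VALUES (1), (NULL), (3)) AS v(id);
```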

Courser Xu · Answer · 2022-07-25 11:38:39Z

1

Please refer to the script below.

SELECT *
FROM 
    (SELECT date(created_at) AS created_at,
         COUNT(id) AS total_orders,
         SUM(total_price) AS total_price,
         SUM(taxes) AS taxes,
         SUM(shipping) AS shipping,
         AVG(total_price) AS average_order_value,
         SUM(total_discount) AS total_discount,
         SUM(total_price - taxes - shipping - total_discount) AS net_sales
    FROM orders
    WHERE shop_id = 43
        AND orders.active = true
        AND orders.created_at >= '2022-07-20'
        AND orders.created_at <= '2022-07-26'
    GROUP BY  date (created_at)
UNION
SELECT dates AS created_at,
         0 AS total_orders,
         0 AS total_price,
         0 AS taxes,
         0 AS shipping,
         0 AS average_order_value,
         0 AS total_discount,
         0 AS net_sales
FROM generate_series('2022-07-20', '2022-07-26', interval '1 day') AS dates
WHERE dates NOT IN 
    (SELECT created_at
    FROM orders
    WHERE shop_id = 43
        AND orders.active = true
        AND orders.created_at >= '2022-07-20'
        AND orders.created_at <= '2022-07-26' ) ) a
ORDER BY  created_at::date desc;

Here is a sample for your reference: Sample

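I could reproduce your duplicate rows on my side. The root cause is that created_at is of data type timestamp: the script above compares the generated dates to raw timestamps in NOT IN, so days that have orders are not excluded and show up twice.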
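A minimal sketch of that comparison, with one made-up order in a CTE named o. It assumes the generated values are plain timestamps; the column names are only for the illustration.

```sql
WITH o(created_at) AS (
   VALUES (timestamp '2022-07-20 09:15:00')  -- one order on the 20th
   )
SELECT d::date AS day
     , d NOT IN (SELECT created_at       FROM o) AS zero_row_raw_timestamp
     , d NOT IN (SELECT created_at::date FROM o) AS zero_row_cast_to_date
FROM   generate_series(timestamp '2022-07-20'
                     , timestamp '2022-07-21'
                     , interval '1 day') AS d;
-- 2022-07-20 | true | false
-- 2022-07-21 | true | true
```

Compared to the raw timestamp, the 20th still gets a zero row although it has an order. After casting to date it does not.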

The script below casts created_at to date in the NOT IN subquery and should be correct for your request.

SELECT *
FROM 
    (SELECT date(created_at) AS created_at,
         COUNT(id) AS total_orders,
         SUM(total_price) AS total_price,
         SUM(taxes) AS taxes,
         SUM(shipping) AS shipping,
         AVG(total_price) AS average_order_value,
         SUM(total_discount) AS total_discount,
         SUM(total_price - taxes - shipping - total_discount) AS net_sales
    FROM orders
    WHERE shop_id = 43
        AND orders.active = true
        AND orders.created_at >= '2022-07-20'
        AND orders.created_at <= '2022-07-26'
    GROUP BY  date (created_at)
UNION
SELECT dates AS created_at,
         0 AS total_orders,
         0 AS total_price,
         0 AS taxes,
         0 AS shipping,
         0 AS average_order_value,
         0 AS total_discount,
         0 AS net_sales
FROM generate_series('2022-07-20', '2022-07-26', interval '1 day') AS dates
WHERE dates NOT IN 
    (SELECT date (created_at)
    FROM orders
    WHERE shop_id = 43
        AND orders.active = true
        AND orders.created_at >= '2022-07-20'
        AND orders.created_at <= '2022-07-26' ) ) a
ORDER BY  created_at::date desc;

Here is a sample that matches your setup: Link

edited Jul 25, 2022 at 11:38

answered Jul 25, 2022 at 7:32

Courser Xu

5 Comments

WagnerMatosUK Over a year ago

Your suggestion looks promising, but it repeats the dates (for the ones where there are orders, I have two rows, one with data and one empty).

Courser Xu Over a year ago

Please use the corrected script.

WagnerMatosUK Over a year ago

It still shows duplicates. There was a duplicated ORDER BY which I removed as it was causing an error. Could that have been related?

Courser Xu Over a year ago

Sorry, the ORDER BY was duplicated. I changed "UNION ALL" to "UNION" for your case. I also created some data for testing; please refer to it.

WagnerMatosUK Over a year ago

Thanks for the help and no need to apologise. I still get the duplicates though :/

Siddheshwar Soni · Answer · 2022-07-25 07:05:26Z

0

You can use WITH RECURSIVE to build a table of dates and then select the dates that are not in your table:

WITH RECURSIVE t(d) AS (
  (SELECT '2015-01-01'::date)
UNION ALL
  (SELECT d + 1 FROM t WHERE d + 1 <= '2015-01-10')
) SELECT d FROM t WHERE d NOT IN (SELECT d_date FROM tbl);

[See this post][1]

[1]: https://stackoverflow.com/questions/28583379/find-missing-dates-postgresql#:~:text=You%20can%20use%20WITH%20RECURSIVE,SELECT%20d_date%20FROM%20tbl)%3B

answered Jul 25, 2022 at 7:05

Siddheshwar Soni

1 Comment

user330315 Over a year ago

generate_series() is much easier to use and probably more efficient as well
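For comparison, the same missing-dates lookup could be sketched with generate_series() roughly like this. The names tbl and d_date come from the answer above; the two rows in the CTE are made-up sample data standing in for the real table.

```sql
WITH tbl(d_date) AS (  -- stand-in for the real table
   VALUES (date '2015-01-02')
        , (date '2015-01-05')
   )
SELECT d::date AS missing_date
FROM   generate_series(timestamp '2015-01-01'
                     , timestamp '2015-01-10'
                     , interval '1 day') AS d
WHERE  NOT EXISTS (SELECT 1 FROM tbl WHERE tbl.d_date = d::date)
ORDER  BY 1;
-- returns the 8 days of 2015-01-01 .. 2015-01-10 other than the 2nd and the 5th
```

NOT EXISTS is used here instead of NOT IN because NOT IN returns no rows at all as soon as the subquery produces a NULL.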
