I'm looking at a table of orders for an ecommerce website and trying to build a customers table with some basic info about each customer.
I'm getting tripped up when trying to combine window functions like NTH_VALUE with normal aggregate functions.
The orders table looks like this:
order_id | customer_id | order_date | revenue
----------------------------------------------
1 | 11 | 2017-01-01 | 5.0
2 | 11 | 2018-02-01 | 2.25
3 | 12 | 2019-03-01 | 1.0
4 | 13 | 2016-04-01 | 12.0
5 | 13 | 2016-05-01 | 15.25
6 | 13 | 2018-06-01 | 25.25
I'm looking to build a Customers table that looks like this:
customer_id | num_orders | first_order_date | first_order_revenue | second_order_date
--------------------------------------------------------------------------------------
11 | 2 | 2017-01-01 | 5.0 | 2018-02-01
12 | 1 | 2019-03-01 | 1.0 | n/a
13 | 3 | 2016-04-01 | 12.0 | 2016-05-01
My query currently looks something like this:
SELECT
customer_id,
COUNT(customer_id) num_orders,
MIN(order_date) first_order_date,
FIRST_VALUE(revenue) OVER w1 first_order_revenue,
NTH_VALUE(order_date, 2) OVER w1 second_order_date
FROM `orders`
GROUP BY customer_id
WINDOW w1 as (PARTITION BY customer_id ORDER BY order_date ASC)
But it tells me I need to add "revenue" and "order_date" to the GROUP BY, with errors like this:
"SELECT list expression references column revenue which is neither grouped nor aggregated at [5:13]"
But when I add them to the GROUP BY, it returns one row per order instead of one row per customer: num_orders is 1 on every row, first_order_date is different on each row, first_order_revenue is the same (correct) value on each row, and second_order_date is correct except on the first row, where it is null:
customer_id | num_orders | first_order_date | first_order_revenue | second_order_date
--------------------------------------------------------------------------------------
13 | 1 | 2016-04-01 | 12.0 | *null*
13 | 1 | 2016-05-01 | 12.0 | 2016-05-01
13 | 1 | 2018-06-01 | 12.0 | 2016-05-01
I'm slowly teaching myself SQL, but I can't find a solution to this specific issue online. I'm guessing it might take a nested SELECT for the window functions that is then JOINed with the aggregate (non-window) results? I've tried a few variations, but nothing has worked so far.
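To make that guess concrete, here is a minimal sketch of the nested-SELECT idea. It is not a confirmed solution. It assumes the null on the first row comes from the default window frame, which stops at the current row when the window has an ORDER BY, so the explicit ROWS BETWEEN clause is my addition. The ANY_VALUE calls are also an assumption on my part: they rely on every row of a customer carrying the same window result once the frame spans the whole partition. Rather than a JOIN, the outer query just groups the subquery's rows.

```sql
SELECT
  customer_id,
  COUNT(*) AS num_orders,
  MIN(order_date) AS first_order_date,
  -- Each row of a customer holds the same window result,
  -- so picking any one of them is enough.
  ANY_VALUE(first_order_revenue) AS first_order_revenue,
  ANY_VALUE(second_order_date) AS second_order_date
FROM (
  -- Inner query: compute the window functions per order row, no GROUP BY here.
  SELECT
    customer_id,
    order_date,
    FIRST_VALUE(revenue) OVER w1 AS first_order_revenue,
    NTH_VALUE(order_date, 2) OVER w1 AS second_order_date
  FROM `orders`
  WINDOW w1 AS (
    PARTITION BY customer_id
    ORDER BY order_date ASC
    -- Assumption: widen the frame to the whole partition so the second
    -- order is visible from the first row as well.
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  )
)
GROUP BY customer_id
ORDER BY customer_id
```

With this shape, a customer who has only one order should get NULL for second_order_date, since there is no second row in the partition.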
Thank you to anyone who can help!