8

I have two Postgres tables (really, more than that, but simplified for the purpose of the question) - one a record of products that have been ordered by customers, and another a historical record of prices per customer and a date they went into effect. Something like this:

'orders' table

customer_id | timestamp           | quantity
------------+---------------------+---------
1           | 2015-09-29 16:01:01 | 5
1           | 2015-10-23 14:33:36 | 3
2           | 2015-10-19 09:43:02 | 7
1           | 2015-11-16 15:08:32 | 2

'prices' table

customer_id | effective_time      | price
------------+---------------------+-------
1           | 2015-01-01 00:00:00 | 15.00
1           | 2015-10-01 00:00:00 | 12.00
2           | 2015-01-01 00:00:00 | 14.00

I'm trying to create a query that will return every order and its unit price for that customer at the time of the order, like this:

desired result

customer_id | quantity | price
------------+----------+------
1           | 5        | 15.00
1           | 3        | 12.00
2           | 7        | 14.00
1           | 2        | 12.00

This is essentially what I want, but I know that you can't reference an outer query inside an inner query, and I'm having trouble figuring out how to re-factor:

SELECT
    o.customer_id,
    o.quantity,
    p.price
FROM orders o
    INNER JOIN (
        SELECT price
        FROM prices x
        WHERE x.customer_id = o.customer_id
            AND x.effective_time <= o.timestamp
        ORDER BY x.effective_time DESC
        LIMIT 1
    ) p
;

Can anyone suggest the best way to make this work?

1
  • 3
    You can reference the orders table in the join query if you use join lateral. Commented Nov 16, 2015 at 21:26

2 Answers 2

6

Instead of joining an inline view based on the prices table, you can perform a subquery in the SELECT list:

SELECT customer_id, quantity, (
    SELECT price
    FROM prices p
    WHERE
      p.customer_id = o.customer_id
        AND p.effective_time <= o.timestamp
    ORDER BY p.effective_time DESC
    LIMIT 1
  ) AS price
FROM orders o

That does rely on a correlated subquery, which could be bad for performance, but with the way your data are structured I doubt there's a substantially better alternative.

Sign up to request clarification or add additional context in comments.

1 Comment

Okay, yeah, this makes sense. I'll give this a shot and hopefully performance won't be an issue. Thanks!
1

You dont need the subquery, just a plain inner join will do (this assumes there are no duplicate effective_times per customer):

SELECT o.customer_id, o.quantity
    ,p.price
FROM orders o
JOIN prices p ON p.customer_id = o.customer_id
              AND p.effective_time <= o.timestamp
              AND NOT EXISTS ( SELECT * FROM prices nx
                  WHERE nx.customer_id = o.customer_id
                  AND nx.effective_time <= o.timestamp
                  AND nx.effective_time > p.effective_time
              )
    ;

2 Comments

Okay... interesting. That could work. It still has a subquery though -- do you have any idea what the performance would be like related to @JohnBollinger 's answer above?
Performance is probably the same. Only a subquery using rownumber() over (partition by customer_id ORDER BY effective_time DESC) could possibly perform better.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.