How to Define a Window Frame in SQL Window Functions
The window function is a super powerful resource of the SQL language. At the core of any window function, there is a set of records, called the window frame, defined using an OVER clause. Knowing what records are in the window frame, how they are ordered, and what their upper and lower bounds are, are critical in understanding how window functions work. In this article, we will analyze and explain with examples how you can define different types of window frames. Read on to take an important step in the growth of your SQL skills!
Using PARTITION BY to Define a Window Frame
SQL window functions perform calculations based on a set of records. For example, you might want to calculate the average salary of a specific group of employee records. This group of records is called the window frame, and its definition is central to understanding how window functions work and how we can take advantage of them.
The window frame is a set of rows related to the current row where the window function is used for calculation. The window frame can be a different set of rows for the next row in the query result, since it depends on the current row being processed. Every row in the result set of the query has its own window frame.
In the rest of this article, we will show example queries based on a database of a car dealership group. The group stores the sales information grouped by month in a table called monthly_car_sales. Below is the table with some sample data:
monthly_car_sales
| year | month | make | model | type | quantity | revenue |
|---|---|---|---|---|---|---|
| 2021 | 01 | Ford | F100 | PickUp | 40 | 2500000 |
| 2021 | 01 | Ford | Mustang | Car | 9 | 1010000 |
| 2021 | 01 | Renault | Fuego | Car | 20 | 9000000 |
| 2021 | 02 | Renault | Fuego | Car | 50 | 23000000 |
| 2021 | 02 | Ford | F100 | PickUp | 20 | 1200000 |
| 2021 | 02 | Ford | Mustang | Car | 10 | 1050000 |
| 2021 | 03 | Renault | Megane | Car | 50 | 20000000 |
| 2021 | 03 | Renault | Koleos | Car | 15 | 1004000 |
| 2021 | 03 | Ford | Mustang | Car | 20 | 2080000 |
| 2021 | 04 | Renault | Megane | Car | 50 | 20000000 |
| 2021 | 04 | Renault | Koleos | Car | 15 | 1004000 |
| 2021 | 04 | Ford | Mustang | Car | 25 | 2520000 |
A simple way to create a window frame is by using an OVER clause with a PARTITION BY subclause. In the following SQL example, we generate a report of revenue by make of the car for the year 2021.
SELECT make,
SUM(revenue) OVER (PARTITION BY make) AS total_revenue
FROM monthly_car_sales
WHERE year = 2021
Below, the window frames generated by the previous query are shown in different colors (red for Ford and blue for Renault). All records with the same value in the make column (the rows as color-coded below) belong to the same window frame. Since we have only two different values in the make column, we have two window frames.
| year | month | make | model | type | quantity | revenue |
|---|---|---|---|---|---|---|
| 2021 | 01 | Ford | F100 | PickUp | 40 | 2500000 |
| 2021 | 01 | Ford | Mustang | Car | 9 | 1010000 |
| 2021 | 01 | Renault | Fuego | Car | 20 | 9000000 |
| 2021 | 02 | Renault | Fuego | Car | 50 | 23000000 |
| 2021 | 02 | Ford | F100 | PickUp | 20 | 1200000 |
| 2021 | 02 | Ford | Mustang | Car | 10 | 1050000 |
| 2021 | 03 | Renault | Megane | Car | 50 | 20000000 |
| 2021 | 03 | Renault | Koleos | Car | 15 | 1004000 |
| 2021 | 03 | Ford | Mustang | Car | 20 | 2080000 |
| 2021 | 04 | Renault | Megane | Car | 40 | 15000000 |
| 2021 | 04 | Renault | Koleos | Car | 20 | 1504000 |
| 2021 | 04 | Ford | Mustang | Car | 25 | 2520000 |
The result of the query is:
| make | total_revenue |
|---|---|
| Ford | 10360000 |
| Renault | 69508000 |
I would like to suggest 2 articles where you can find a lot of introductory information about SQL window functions: “SQL Course of The Month - Window Functions” and “When Do I Use SQL Window Functions?”.
Ordering the Rows Within a Window Frame With ORDER BY
In addition to PARTITION BY, we can use an ORDER BY subclause to order the rows inside a window frame. Having the window frame ordered by some criteria allows us to use analytic window functions like LEAD(), LAG(), and FIRST_VALUE(), among others.
For example, if we want to obtain the revenue difference between consecutive months, we can order the window frame by month. Then, given any current row, the LAG() window function can return any column of the previous month. Let’s see an example that obtains the revenue difference between each pair of consecutive months.
SELECT make,
model,
month,
revenue AS current_month_revenue,
LAG(revenue) OVER ( ORDER BY month) AS previous_month_revenue,
revenue - LAG (revenue) OVER (ORDER BY month) AS delta_revenue
FROM monthly_car_sales
WHERE year = 2021
AND model = 'Mustang'
The result of the previous query is:
| make | Model | Month | Current Month Revenue | Previous Month Revenue | Delta Revenue |
|---|---|---|---|---|---|
| Ford | Mustang | 1 | 1010000 | NULL | NULL |
| Ford | Mustang | 2 | 1050000 | 1010000 | 4000 |
| Ford | Mustang | 3 | 2080000 | 1050000 | 103000 |
| Ford | Mustang | 4 | 2520000 | 2080000 | 440000 |
The first row in the result has NULL values in previous_month_revenue and delta_revenue columns. This is because there is no previous month for January. The column delta_revenue is calculated between the current month’s revenue and the previous month’s revenue, the latter of which is obtained with the LAG() window function.
When we use ORDER BY in an OVER clause, there is a new element to consider: the bounds of the window frame. If we don’t specify any ORDER BY clause, the entire partition becomes the window frame. However, when we use an ORDER BY subclause, the current row becomes the upper bound of the window frame. In other words, the rows following the current row (based on the ORDER BY criteria) are not included in the window frame. In the next section, we will cover the concept of window frame bounds in detail.
Defining Window Frame Bounds With ROWS
A very interesting feature of the OVER clause is the ability to specify the upper and lower bounds of a window frame. These bounds can be specified by using one of the two subclauses in the OVER clause: ROWS or RANGE. In this section, we will explain how to use the ROWS subclause of the OVER clause.
The window frame is a set of rows that are somehow related to the current row. Their bounds can be defined for each row in the query result with a ROWS subclause, which has the following syntax:
ROWS BETWEEN lower_bound AND upper_bound
As we already mentioned in the previous section, it is important to know what the default bounds of the window frame are. When we specify an ORDER BY subclause, the current row is the upper bound of the window frame by default. However, in some cases, we need to change this upper bound (or the lower bound) as we will see below.
Let’s see an example where we need to specify the bounds of a window frame. Suppose we want a report with the total sales for the current month, the total sales for the previous month, and the maximum sales in any individual month throughout the year, all by make and model. The query to obtain such a report is:
SELECT make,
model,
month,
revenue AS current_month_revenue,
LAG(revenue) OVER (PARTITION BY make, model ORDER BY month) AS
prev_month,
MAX(revenue) OVER (PARTITION BY make, model ORDER BY month ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_year_revenue
FROM monthly_car_sales
WHERE year = 2021
| make | model | month | current_month_revenue | prev_month | max_year_revenue |
|---|---|---|---|---|---|
| Ford | F100 | 1 | 2500000 | NULL | 2500000 |
| Ford | F100 | 2 | 1200000 | 2500000 | 2500000 |
| Ford | Mustang | 1 | 1010000 | NULL | 2520000 |
| Ford | Mustang | 2 | 1050000 | 1010000 | 2520000 |
| Ford | Mustang | 3 | 2080000 | 1050000 | 2520000 |
| Ford | Mustang | 4 | 2520000 | 2080000 | 2520000 |
| Renault | Fuego | 1 | 9000000 | NULL | 23000000 |
| Renault | Fuego | 2 | 23000000 | 9000000 | 23000000 |
| Renault | Koleos | 3 | 1004000 | NULL | 1504000 |
| Renault | Koleos | 4 | 1504000 | 1004000 | 1504000 |
| Renault | Megane | 3 | 20000000 | NULL | 20000000 |
| Renault | Megane | 4 | 15000000 | 20000000 | 20000000 |
Had we omitted the ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING subclause in the MAX() window function in the previous query, we would have obtained the maximum between the first month and the current month. This is wrong, since we want the maximum monthly revenue considering the entire year (including the months after the current month). So, we need to include all available months in the table. We do that by adding the subclause ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to the OVER clause.
The next image shows all the possible bounds we can specify to define the lower and the upper bounds of a window frame:
The options for lower and upper bounds in the OVER clause are:
UNBOUNDED PRECEDINGN PRECEDINGCURRENT ROWN FOLLOWINGUNBOUNDED FOLLOWING
We need to make sure the lower bound is less than the upper bound.
Finally, I would like to suggest the article Why Learn SQL Window Functions in 2021? where you can learn how to use window functions.
Defining Window Frame Bounds With RANGE
In the previous section, we defined the bounds of a window frame in terms of ROWS. In this section, we will explain how to use the subclause RANGE to specify the bounds of a window frame in ranges of rows. The syntax of the RANGE subclause is the following:
RANGE BETWEEN lower_bound AND upper_bound
A range is a set of rows with the same value for the PARTITION BY criteria. For example, if we have a PARTITION BY month, we can see the difference in the next image when we use ROWS or RANGE to define a window frame:
OVER ( PARTITION BY …... | OVER ( PARTITION BY ….. |
If we want a revenue report by make for the current month and for each of the last three months, we can use the following query:
SELECT make,
model,
month,
revenue AS model_revenue_current_month,
SUM(revenue) OVER ( PARTITION BY make
ORDER BY month
RANGE BETWEEN 0 PRECEDING AND CURRENT ROW
) AS make_current_month,
SUM(revenue) OVER (PARTITION BY make
ORDER BY month
RANGE BETWEEN 1 PRECEDING AND CURRENT ROW
) AS make_last_2_months,
SUM(revenue) OVER (PARTITION BY make
ORDER BY month
RANGE BETWEEN 2 PRECEDING AND CURRENT ROW
) AS make_last_3_months
FROM monthly_car_sales
WHERE year = 2021
ORDER BY 1,3,2
The previous query uses the RANGE subclause to specify a window frame with all the records of the current make for a range of N months. For example:
SUM(revenue) OVER (
PARTITION BY make
ORDER BY month
RANGE BETWEEN 1 PRECEDING AND CURRENT ROW
) AS make_last_2_months
The previous subclause RANGE BETWEEN 1 PRECEDING AND CURRENT ROW specifies a window frame that includes the preceding month and the current month. Then, the SUM() function will return the total revenue in the last two months.
Similarly, we can use the following OVER clause to obtain the total revenue in the last three months.
SUM(revenue) OVER (
PARTITION BY make
ORDER BY month
RANGE BETWEEN 2 PRECEDING AND CURRENT ROW
) AS make_last_3_months
There are several abbreviations available to make the syntax easier for these bounding clauses:
| Abbreviation | Complete Syntax |
|---|---|
| UNBOUNDED PRECEDING | BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
| n PRECEDING | BETWEEN n PRECEDING AND CURRENT ROW |
| CURRENT ROW | BETWEEN CURRENT ROW AND CURRENT ROW |
| n FOLLOWING | BETWEEN AND CURRENT ROW AND n FOLLOWING |
| UNBOUNDED FOLLOWING | BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING |
Last, but not least, here is my preferred cheat sheet on window functions with a lot of syntax details: SQL window functions cheat sheet.
Take Advantage of Window Functions!
A central idea in the window function is the window frame, that is, the group of records over which the window function works. In this article, we have explained that the window frame depends on the current row and is defined by the OVER clause. We have also shown several examples for defining which records to include in the window frame, ordering the rows within it, and defining its bounds.
For those who want to go deeper, I suggest the Window Functions SQL online course, with plenty of examples using different window functions. Develop your skills and increase your assets!!
