Window Functions in SQL

The SQL window function is a crucial feature for everyone because it makes it possible to calculate a specific set of rows while keeping the rows in the dataset intact. This is referred to as “windows”. These help in advanced data analytics and database management.

In this article, we will examine the definition of SQL window functions, their classifications, and real-life examples with reference to how they work.

Table of Contents

What are Window Functions in SQL Server?
Types of Window Functions
Use Cases of Window Functions with Examples
Performance Considerations of Window Functions
Conclusion

What are Window Functions in SQL Server?

SQL window functions let you compute values over a set of records associated with the current record without grouping them. Window functions are different from aggregate functions such as SUM() or AVG() because window functions do not change the total number of rows in the data, rather, they provide more insight into a single record from the data.

Window functions could calculate the running total, rank, or moving average with the full dataset intact.

Types of Window Functions

Window functions in SQL can be categorized into the following three types:

1. Ranking Functions

In SQL, ranking functions are used to assign a position to specific rows in a given order. They provide an order to the dataset without deleting any rows.

For the implementation, we are going to create an employee table with columns employee_id, name, department and salary

CREATE TABLE employees (

    employee_id INT PRIMARY KEY,

    name VARCHAR(100),

    department VARCHAR(50),

    salary DECIMAL(10,2)

);

INSERT INTO employees (employee_id, name, department, salary) VALUES
(1, 'Alice', 'HR', 60000),
(2, 'Bob', 'HR', 75000),
(3, 'Charlie', 'HR', 70000),
(4, 'David', 'Engineering', 90000),
(5, 'Eve', 'Engineering', 85000),
(6, 'Frank', 'Engineering', 95000),
(7, 'Grace', 'Marketing', 72000),
(8, 'Hank', 'Marketing', 78000),
(9, 'Ivy', 'Marketing', 75000);

1. ROW_NUMBER()

This function gives a unique value to each row in a specified order without any matching values. In this case, there are no ties and every row receives a different number, such as 1, 2, 3 and so forth.

SELECT employee_id, department, salary,
       ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;

2. RANK()

This function uniquely ranks each row within a specified order. It also skips numbers in its ranking if there are multiple ties. For example, if there are two rows sharing the same rank value of 2, then the next row would be assigned rank 4.

 SELECT employee_id, department, salary,
       RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;

3. DENSE_RANK()

This function can be viewed as a variation of the RANK function. It does not skip ranks for duplicate values. For example, if there are two rows sharing the same rank of 2, the next row will receive a rank value of 3.

SELECT employee_id, department, salary,
       DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank
FROM employees;

4. NTILE()

This function partitions the data into nearly equal sections (as close as possible) after dividing it into n sections and assigns a group number to each row. For example, if the data is cut into 4 sections, every row shall be assigned a number from 1 to 4.

SELECT employee_id, department, salary,
       NTILE(4) OVER (PARTITION BY department ORDER BY salary DESC) AS quartile

2. Aggregate Functions

In SQL, aggregate functions are unique tools that summarize values from multiple rows into a single result. Aggregate functions work behind the scenes to provide you with totals, averages, counts, etc.

1. SUM()

It calculates the total value of all items in a column. Mostly used when a total has to be computed. For instance, calculate the total revenue of a company by summing all sales.

SELECT employee_id, department, salary,

       SUM(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS running_total

FROM employees;

2. AVG()

It finds the average value of items in a column. Effective when trying to understand patterns. For instance, find the average salary of employees in a department.

SELECT employee_id, department, salary,

       AVG(salary) OVER (PARTITION BY department) AS average_salary

FROM employees;

3. MIN() & MAX()

The following is the explanation of the min() and ma x () functions:

MIN() – It gives the smallest value (least amount) found in the column. Useful in determining the smallest value. For instance, get the lowest product price in an inventory.

MAX() – It gives the largest value (most extreme) found in the column. Useful for determining the most valuable assets. For instance, find the highest marks scored in a class.

SELECT employee_id, department, salary,

       MIN(salary) OVER (PARTITION BY department) AS min_salary,

       MAX(salary) OVER (PARTITION BY department) AS max_salary

FROM employees;

3. Analytic Functions

Analytic functions in SQL are parallel to window functions. They perform calculations in a dataset on a defined set of rows, however, they do not alter the data as presented. These functions let one see data in a more intricate multivariate manner while keeping the dataset intact.

1. LEAD()

It lets you access the next row’s value in a column. Useful for comparing the current row with the next one.

Example: Compare the sales of today with tomorrow’s sales.

SELECT employee_id, department, salary,
       LEAD(salary, 1) OVER (PARTITION BY department ORDER BY salary DESC) AS next_salary
FROM employees;

2. LAG()

It lets you access the previous row’s value in a column. It is helpful for comparing the current row with the one before it. Example: Compare today’s sales with yesterday’s sales.

SELECT employee_id, department, salary,
       LAG(salary, 1) OVER (PARTITION BY department ORDER BY salary DESC) AS previous_salary
FROM employees;

3. FIRST_VALUE()

It returns the first value in a specified window of rows. Example: Get the first product price from a sorted list.

4. LAST_VALUE()

It returns the last value in a specified window of rows. Example: Get the last order date from a sorted list of orders.

SELECT employee_id, department, salary,
       FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS highest_salary,
       LAST_VALUE(salary) OVER (
    PARTITION BY department 
    ORDER BY salary DESC 
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS lowest_salary

Use Cases of Window Functions with Examples

1. Calculating Running Totals

Window functions assist you in the tracking of running totals where each row represents the total amount from the start until that row.

SELECT employee_id, sales_date, sales_amount,

       SUM(sales_amount) OVER (PARTITION BY employee_id ORDER BY sales_date) AS running_total

FROM sales;

2. Ranking Top Performers

With window functions, you are able to track the top n data entries like top sales or top employees without the underlying data getting lost.

SELECT employee_id, department, performance_score,

       RANK() OVER (PARTITION BY department ORDER BY performance_score DESC) AS rank

FROM employee_performance

WHERE rank <= 3;

3. Calculating Moving Averages

You are able to track Moving averages to get rid of short-term fluctuations and get an idea of underlying trends.

SELECT sales_date, sales_amount,

       AVG(sales_amount) OVER (ORDER BY sales_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg

FROM sales;

4. Comparing Current and Previous Sales

You may track the current row value against a previous row value instantly, as one can track today’s sales or yesterday’s.

SELECT sales_date, sales_amount,

       LAG(sales_amount, 1) OVER (ORDER BY sales_date) AS previous_sales,

       (sales_amount - LAG(sales_amount, 1) OVER (ORDER BY sales_date)) AS sales_growth

FROM sales;

5. Identifying First and Last Transactions

Window functions help you track the first value or last value in the series, like the first transaction in a series of transactions or the last one in the series.

SELECT customer_id, transaction_date, amount,

       FIRST_VALUE(transaction_date) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS first_transaction,

       LAST_VALUE(transaction_date) OVER (PARTITION BY customer_id ORDER BY transaction_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_transaction

FROM transactions;

Performance Considerations of Window Functions

Like aggregation functions, window functions are often not optimized to their full potential. For example, you have to make sure that the column that is being used in PARTITION BY is indexed for fast access. If it’s not indexed, the system will have to go through each row, and this ends up making querying too slow. On top of that, performance may also be affected by sorting (ORDER BY), especially if the dataset is very large. Processing time can also increase when there are multiple window functions in one single query. A way to optimize these functions is to restrict the number of rows processed by applying ROWS BETWEEN instead of checking the entire dataset. If the queries are still slow, break them down into simpler pieces or use materialized views to run queries faster.

Conclusion

SQL makes use of window functions to help keep track of complex computations without losing the details of each individual record. It helps to perform complex tasks such as data ranking, row-wise running total, and inter-row comparisons with ease. These functions, ROW_NUMBER, RANK, LEAD, and LAG, make a powerful enhancement to your analysis. Thus, knowing how to use window functions extends your capacity to work with SQL and analyze data in more intelligent ways.

Dive into the world of advanced SQL queries and functions through the detailed articles provided.-

Multiple Rows to Comma-Separated Value – This tutorial covers the technique of converting Multiple Rows to Comma-Separated Values in SQL.

Convert SQL to LINQ Query – Here, you will discover how to translate SQL commands into LINQ queries effectively.

SQL Command Categories – This article explains the classification of SQL commands and their key purposes.

Database Design in DBMS – Explore the essentials of database design principles in DBMS through this guide.

SQL Server Transactions – This discussion focuses on the implementation and management of SQL Server transactions.

CTE in SQL – This resource breaks down the use and benefits of CTEs (Common Table Expressions) in SQL.

Pivot and Unpivot in SQL – This piece illustrates how to utilize pivot and unpivot SQL operations efficiently.

SQL Subqueries – Understand the functionality of SQL subqueries and their practical uses in this article.

Update from SELECT – This guide demonstrates how to perform updates using SELECT statements in SQL.