Common Table Expression (CTE) in SQL

In SQL Server, Common Table Expressions (CTEs) allow us to make complex SQL queries simple to write and understand. It basically creates a temporary group of results that you can use in statements like SELECT, INSERT, UPDATE and DELETE. This helps you make your SQL code cleaner and more organized, allowing you to manage it easily.

In this article, we are going to learn what CTEs are and how they work. We will see the implementation of CTEs with the help of SQL Server examples.

Table of Contents

What is a CTE?

A Common Table Expression (CTE) is like having a temporary table that will be created only when you are running statements like SELECT, INSERT, UPDATE or DELETE. You can give it a name of your choice and use it within the same query. CTEs can also call themselves, which are also known as recursive CTEs. It can be used when you are working with data like family trees or company structures.

Syntax for CTE

WITH CTE_Name AS (
    -- SQL query to define the CTE
    SELECT column1, column2
    FROM TableName
    WHERE condition
)

SELECT * FROM CTE_Name;

In the above syntax, CTE_Name is the name of the temporary table. Inside the parentheses, you can write the queries to select data from the table.

Why Use CTEs in SQL Server?

  1. Improved Readability: CTEs divide the big queries into small parts that make it easier to understand.
  2. Reusable Code: You can use the same CTE multiple times whenever required in your query.
  3. Recursion: CTEs can call themselves, which can be very useful when working with data that is hierarchical or tree in structure.
  4. Performance: It can make queries run a bit faster as it tries to keep them simple.

CTE for Retrieving Data

Let’s take a simple example where we are using a table named Intellipaat_Employees. Let’s say that you want to extract all the rows where the employee’s department is HR.

Example 1: Basic CTE

CREATE TABLE Intellipaat_Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName NVARCHAR(50),
    LastName NVARCHAR(50),
    Department NVARCHAR(50)
);
INSERT INTO Intellipaat_Employees (EmployeeID, FirstName, LastName, Department)
VALUES (1, 'John', 'Doe', 'HR'),
       (2, 'Jane', 'Smith', 'IT'),
       (3, 'Samuel', 'Jackson', 'HR');

WITH HR_Employees AS (
    SELECT EmployeeID, FirstName, LastName, Department
    FROM Intellipaat_Employees
    WHERE Department = 'HR'
)
SELECT * FROM HR_Employees;
Basic CTE
Basic CTE Image

Explanation:

  • Then the main query retrieves the output from the CTE.
  • We designed a table, Intellipaat_Employees for storing employees information.
  • HR_Employees is a CTE that pulls resourced HR employees.

CTEs vs Subqueries

Both subqueries and CTEs are utilized to simplify a long and complicated query. Still, CTEs are much easier to read and use it later also. Let’s see each method:

Example 2: Using a Subquery

SELECT EmployeeID, FirstName, LastName, Department

FROM Intellipaat_Employees

where department=(select department from Intellipaat_Employees where Department='IT')
Using a Subquery Image

Here, we access Intellipaat_Employees directly without using a CTE, and while this is perfectly functional, it has the downside of being quite bulky during especially rigid queries.

CTEs do have their advantages when dealing with several tables or performing rigid query logic; it’s convenient for breaking up complex queries.

Recursive CTEs

CTEs’ ability to handle recursion is one of its most extraordinary powers. Hierarchical data, such as employees and superiors are managed using recursive CTEs.

We will take a sample to build a recursive CTE that gets the whole organization hierarchy. The sample will take the Intellipaat_Employees table, assuming that a ManagerID means the EmployeeID of the respective manager.

Example 3: Recursive CTE for Organizational Hierarchy

-- Modify the Intellipaat_Employees table to include a ManagerID field

ALTER TABLE Intellipaat_Employees ADD ManagerID INT;

-- Sample data with ManagerID

UPDATE Intellipaat_Employees

SET ManagerID = NULL

WHERE EmployeeID = 1;  -- CEO

UPDATE Intellipaat_Employees

SET ManagerID = 1

WHERE EmployeeID = 2;  -- Manager of IT

UPDATE Intellipaat_Employees

SET ManagerID = 1

WHERE EmployeeID = 3;  -- Manager of HR
Recursive CTE for Organizational Hierarchy
Recursive CTE for Organizational Hierarchy Image
Recursive CTE Image
-- Recursive CTE to retrieve the hierarchy
WITH RECURSIVE OrgHierarchy AS (
    -- Base case: top-level manager (CEO)
    SELECT EmployeeID, FirstName, LastName, ManagerID
    FROM Intellipaat_Employees
    WHERE ManagerID IS NULL
    
    UNION ALL
    
    -- Recursive case: find employees managed by the ones already in OrgHierarchy
    SELECT e.EmployeeID, e.FirstName, e.LastName, e.ManagerID
    FROM Intellipaat_Employees e
    INNER JOIN OrgHierarchy o ON e.ManagerID = o.EmployeeID
)
SELECT * FROM OrgHierarchy;
Recursive CTE

Explanation:

  • We added a ManagerID to Intellipaat_Employees to show which employee manages whom.
  • The recursive CTE will start with the highest-ranking executive (CEO) and continue with their subordinates, recursively finding employees under each superior.
  • Therefore, this query displays the whole structure of an organization’s hierarchy, beginning with the peak, which is the CEO.

Practical Use Cases for CTEs

1. Breaking Down Complex Joins

Imagine having an Intellipaat_Sales table and a Products table, now you want to get the overall sales for every product within the West region.

-- Create the Intellipaat_Sales table
CREATE TABLE Intellipaat_Sales (
    SaleID INT PRIMARY KEY,
    ProductID INT,
    Region NVARCHAR(50),
    SalesAmount DECIMAL(10, 2)
);
Breaking Down Complex Joins
-- Sample data for sales
INSERT INTO Intellipaat_Sales (SaleID, ProductID, Region, SalesAmount)
VALUES (1, 101, 'West', 500.00),
       (2, 102, 'East', 300.00),
       (3, 101, 'West', 200.00);
Sample data for sales
-- Define a CTE for total sales in the West region
WITH RegionSales AS (
    SELECT ProductID, SUM(SalesAmount) AS TotalSales
    FROM Intellipaat_Sales
    WHERE Region = 'West'
    GROUP BY ProductID
)
Select * from RegionSales
Define a CTE for total sales in the West region output
-- Main query: Join CTE with Products table
SELECT p.ProductName, rs.TotalSales
FROM Intellipaat_Products p
JOIN RegionSales rs ON p.ProductID = rs.ProductID;
Join CTE with Products table

Explanation:

  • For every product in the West region, sales are calculated in the RegionSales CTE.
  • Then, the main query joins this result set with the Intellipaat_Products table so that, in addition to the sales figure, the product name is also shown.

2. Pagination with Row Numbers

CTEs and the ROW_NUMBER() function make pagination easier in SQL Server. The ROW_NUMBER() function along with CTEs, makes paginating results straightforward.

-- Define a CTE with row numbers

WITH ProductPage AS (

    SELECT ProductID, ProductName, ROW_NUMBER() OVER (ORDER BY ProductName) AS RowNum

    FROM Intellipaat_Products

)
Define a CTE with row numbers
-- Main query: Fetch records for page 2 (10 records per page)
SELECT ProductID, ProductName
FROM ProductPage
WHERE RowNum BETWEEN 11 AND 20;  -- Fetch records for page 2
Fetch records for page

Explanation:

  • The ROW_NUMBER() function assigns a unique number to each row.
  • The main query retrieves a specific “page” of data, for example, the second page (records 11-20).

Performance Considerations

In SQL Server, CTEs are not kept in memory, which means they are recalculated whenever they’re referenced. This can have a negative effect on performance when working with large datasets or when executing recursive queries. If a CTE is frequently referenced within a query, it is advisable to save the output in a temporary table for optimal performance. As for recursion in CTEs, SQL Server has a default recursion limit of 100 levels. To increase that limit, use the following command when deeper recursion is necessary:

OPTION (MAXRECURSION 200);

This increases the limit to 200 levels, which can be useful, but this option should be used correctly to prevent infinite loops and high CPU utilization.

Best Practices for Using CTEs

  1. Use Descriptive CTE Names: Always use meaningful descriptive CTE names. Examples include SalesSummary, OrgHierarchy or EmployeeCTE.
  2. Control SQL Server Recursion: SQL Server has a default recursion depth limit of 100. To control this, the OPTION (MAXRECURSION) clause allows for limiting the recursive depth.
  3. Use Caution: CTEs are powerful pieces of technology, but using them too much in large queries can result in performance issues and make the query more complex, therefore less maintainable.
  4. Deny Temporary Tables: Light in weight and more effective, CTEs are more efficient than creating temporary tables.

Conclusion

Common Table Expressions (CTEs) in SQL Server help you write cleaner and easier-to-read queries. They make complex queries simpler, especially when dealing with repeated calculations, hierarchical data, or multiple joins. CTEs also improve query organization and make debugging easier.

Start using CTEs to make your SQL code more efficient and understandable!

About the Author

Data Engineer

As a skilled Data Engineer, Sahil excels in SQL, NoSQL databases, Business Intelligence, and database management. He has contributed immensely to projects at companies like Bajaj and Tata. With a strong expertise in data engineering, he has architected numerous solutions for data pipelines, analytics, and software integration, driving insights and innovation.

business intelligence professional