Hierarchical Recursive Query in MySQL

Recursive queries are necessary for hierarchical data structures like charts, trees, and file directories to effectively navigate parent-child relationships in MySQL. Hierarchical queries help in retrieving data that contains multiple levels of relationships. In this blog, you will explore the different methods for creating a hierarchical recursive query in MySQL in detail with examples for each.

Table of Contents:

What is Hierarchical Recursive Query in MySQL?

Hierarchical Recursive Query in MySQL provides the ability to query data that is structured in a tree-like parent-child relationship. MySQL can support this kind of query through the WITH RECURSIVE Common Table Expression (CTE). It starts with a base case (anchor member) and recursively applies the recursive step. Recursion will stop when there are no more child records to fetch.

Why do we need to create a Hierarchical Recursive Query in MySQL?

Hierarchical Recursive Querying in MySQL is important for handling parent-child relationships like organizational charts and file systems so that traversal of multi-level relationships is performed efficiently, which is difficult with standard SQL joins. Getting, for example, nested categories in e-commerce, browsing the folder structure, and employees under a manager are examples of typical use cases.

Master Advanced SQL Skills Today!
Enroll now and transform your future!
quiz-icon

Advantages of creating a Hierarchical Recursive Query in MySQL 

  • Simplifies Complex Queries: Improved readability and optimization occur in these complex queries by removing the need for a large number of self-joins.
  • Less Redundancy: These queries are less cumbersome and easier to maintain because they do not include manual recursive lookups from application logic. 
  • Increased Performance: Sequential querying is no longer required since this hierarchical data structure can be retrieved in a single query rather than in several queries.

Before exploring the methods, let us create a table called Folders and insert some values into it that can be used as an example for the following methods

CREATE TABLE folders (
    id INT PRIMARY KEY,
    folder_name VARCHAR(100),
    parent_id INT NULL,         
    path VARCHAR(255) UNIQUE,   
    lft INT NOT NULL,           
    rgt INT NOT NULL            
);

-- Inserting the values into the table
INSERT INTO folders (id, folder_name, parent_id, path, lft, rgt) VALUES
(1, 'Root', NULL, '/', 1, 14),
(2, 'Documents', 1, '/Documents', 2, 7),
(3, 'Images', 2, '/Documents/Images', 3, 4),
(4, 'Videos', 2, '/Documents/Videos', 5, 6),
(5, 'Downloads', 1, '/Downloads', 8, 13),
(6, 'Software', 5, '/Downloads/Software', 9, 10),
(7, 'Music', 5, '/Downloads/Music', 11, 12);

-- To display the table 
SELECT * FROM folders;
Folders table Recursive query

This is how the table looks once it is created and inserted with some values.

Methods to Create a Hierarchical Recursive Query in MySQL

There are several ways to create a Hierarchical Query in MySQL, and they can be appropriate for different circumstances. A Recursive Common Table Expression is one of the most efficient methods we can use to create a Hierarchical Query.

Method 1: Using Recursive Common Table Expression in MySQL

A Recursive CTE is used to query hierarchical data by repeatedly joining the table with itself until the full hierarchy is retrieved. 

Syntax:

WITH RECURSIVE cte_name AS (
    -- Base Case (Starting Point)
    SELECT column1, column2, ..., 1 AS level
    FROM table_name
    WHERE condition_for_root_nodes
    UNION ALL
    -- Recursive Case (Joins Table with Itself)
    SELECT t.column1, t.column2, ..., cte.level + 1
    FROM table_name t
    INNER JOIN cte_name cte ON t.parent_column = cte.id
)
SELECT * FROM cte_name;

Example:

WITH RECURSIVE folder_hierarchy AS (
    -- Base Case: Start from the Root folder
    SELECT id, folder_name, parent_id, path, 1 AS level
    FROM folders
    WHERE parent_id IS NULL
    UNION ALL
    -- Recursive Case: Fetch child folders
    SELECT f.id, f.folder_name, f.parent_id, f.path, fh.level + 1
    FROM folders f
    INNER JOIN folder_hierarchy fh ON f.parent_id = fh.id
)
SELECT * FROM folder_hierarchy;

Output:

Using Recursive Common Table Expression in MySQL

Explanation: Here, the SELECT query selects the (parent_id IS NULL) as a base class and assigns it as level = 1. Then, the recursive part joins the Folders table with the CTE itself.

Method 2: Using the Adjacency List Model in MySQL

This Adjacency List Model represents the hierarchical data using a self-referencing foreign key in which each row stores a reference (parent_id) to its direct parent.

Syntax:

SELECT parent.* FROM table_name AS child
JOIN table_name AS parent ON child.parent_id = parent.id
WHERE child.id = ?;

Example:

-- Query to find the Parent of the given folder 
SELECT parent.*
FROM folders AS child
JOIN folders AS parent ON child.parent_id = parent.id
WHERE child.id = 3; 

Output:

Using the Adjacency List Model in MySQL

Explanation: Here, the JOIN Statement joins the table to itself using parent_id, retrieving the parent folder of a given child by matching the child.parent_id = parent.id. The WHERE Clause filters for a specific child (id=3), returning its parent details

Method 3: Using the Nested Set Model in MySQL

The Nested Set Model represents the hierarchical data using left and right values, which makes the retrieval process more efficient.

Syntax:

SELECT child.column_name, (COUNT(parent.column_name) - 1) AS depth
FROM table_name AS child
JOIN table_name AS parent 
    ON child.lft BETWEEN parent.lft AND parent.rgt
WHERE parent.column_name = 'randomvalue'
GROUP BY child.id
ORDER BY child.lft;

Example:

SELECT child.folder_name, child.lft, child.rgt, 
       (COUNT(parent.id) - 1) AS depth
FROM folders AS child
JOIN folders AS parent 
    ON child.lft BETWEEN parent.lft AND parent.rgt
WHERE parent.folder_name = 'Documents'
GROUP BY child.id
ORDER BY child.lft;

Output:

Using the Nested Set Model in MySQL

Explanation: To identify the hierarchical relationships, the JOIN Statement joins the table to itself using lft and rgt. The COUNT() statement calculates the depth and filters the result based on the specific parent node.

Method 4: Using the Materialized Path Model in MySQL

The Materialized Path represents the hierarchical data by storing the entire path of a node as a delimited string. 

Syntax:

SELECT column_name1, column_name2, 
       LENGTH(path_column) - LENGTH(REPLACE(path_column, 'delimiter', '')) AS depth
FROM table_name
ORDER BY path_column;

Example:

SELECT folder_name, path, 
       LENGTH(path) - LENGTH(REPLACE(path, '/', '')) AS depth
FROM folders
WHERE path LIKE '/Documents/%'
ORDER BY path;

Output:

Using the Materialized Path Model in MySQL

Explanation: The WHERE clause filters the rows whose path starts with /Documents/, which ensures that only descendants of ‘Documents’ are selected.

Get 100% Hike!

Master Most in Demand Skills Now!

Method 5: Using the Closure Table Model in MySQL

The Closure Table Model represents the hierarchical relationships in a separate table by maintaining all the ancestor-descendant pairs.

Syntax:

CREATE TABLE table_name (
    id INT PRIMARY KEY,      
    name VARCHAR(100)         
);

CREATE TABLE table_closure (
    ancestor INT,             
    descendant INT,           
    depth INT,                
    FOREIGN KEY (ancestor) REFERENCES table_name(id),
    FOREIGN KEY (descendant) REFERENCES table_name(id)
);

Example:

Let us use the following table as an example for a better understanding.

CREATE TABLE folders (
    id INT PRIMARY KEY,
    folder_name VARCHAR(100)
);

CREATE TABLE folder_closure (
    ancestor INT,
    descendant INT,
    depth INT,
    PRIMARY KEY (ancestor, descendant),
    FOREIGN KEY (ancestor) REFERENCES folders(id),
    FOREIGN KEY (descendant) REFERENCES folders(id)
);

-- Insert folders
INSERT INTO folders (id, folder_name) VALUES
(1, 'Root'),
(2, 'Documents'),
(3, 'Images'),
(4, 'Videos'),
(5, 'Downloads'),
(6, 'Software'),
(7, 'Music');

-- Insert ancestor-descendant relationships
INSERT INTO folder_closure (ancestor, descendant, depth) VALUES
(1, 1, 0), (2, 2, 0), (3, 3, 0), (4, 4, 0), (5, 5, 0), (6, 6, 0), (7, 7, 0),  
(1, 2, 1), (1, 3, 2), (1, 4, 2), (1, 5, 1), (1, 6, 2), (1, 7, 2),  
(2, 3, 1), (2, 4, 1),  
(5, 6, 1), (5, 7, 1);  

-- Query to retrieve all the descendants of documents (id=2)
SELECT f.folder_name, fc.depth
FROM folder_closure fc
JOIN folders f ON fc.descendant = f.id
WHERE fc.ancestor = 2
ORDER BY fc.depth;

Output:

Using the Closure Table Model in MySQL

Explanation: Here, the JOIN Statement joins the folder_closure table with folders using descendant = id, which retrieves all the descendants of Documents

Method 6: Using the Path Enumeration Model in MySQL

This Path Enumeration Model represents the hierarchical data by storing the full path of each node as a string.

Syntax:

CREATE TABLE table_name (
    id INT PRIMARY KEY,       
    name VARCHAR(100),        
);

Example:

Let us use the following table as an example for a better understanding.

-- Create table structure using path enumeration
CREATE TABLE categories (
    id INT PRIMARY KEY,
    category_name VARCHAR(100),
    path VARCHAR(255) UNIQUE
);

-- Insert some values into it
INSERT INTO categories (id, category_name, path) VALUES
(1, 'Electronics', '/1'),  
(2, 'Mobiles', '/1/2'),
(3, 'Laptops', '/1/3'),
(4, 'Accessories', '/1/2/4'),
(5, 'Chargers', '/1/2/4/5'),
(6, 'Gaming', '/1/3/6');

-- Query to find all subcategories of ‘Mobiles’ (id=2)
SELECT * FROM categories WHERE path LIKE '/1/2/%';

Output:

Using the Path Enumeration Model in MySQL

Explanation: The SELECT query effectively retrieves all of the mobile subcategories by identifying all categories whose paths begin with /1/2/.

Alternative Methods to Create a Hierarchical Recursive Query in MySQL JSON or NOSQL

Alternative methods like the Graph-Based Approach and document-based structures can also be used to create a Hierarchical Recursive Query in MySQL JSON or NOSQL

Using Graph-Based Approach in MySQL JSON or NOSQL

In the graph-based approach, the hierarchical relationships are stored in a JSON column of a MySQL table.

Syntax:

CREATE TABLE table_name (
    id INT PRIMARY KEY,  
    name VARCHAR(100),  
    hierarchy JSON  
);

Example:

CREATE TABLE table_name (
    id INT PRIMARY KEY,  
    name VARCHAR(100),  
    hierarchy JSON  
);

INSERT INTO table_name (id, name, hierarchy) VALUES
(1, 'Root', JSON_ARRAY()),  
(2, 'Category A', JSON_ARRAY(1)),  
(3, 'Subcategory A1', JSON_ARRAY(2)),  
(4, 'Category B', JSON_ARRAY(1)),  
(5, 'Subcategory B1', JSON_ARRAY(4));

-- Query to find direct children of a Parent
SELECT name, hierarchy  
FROM table_name  
WHERE JSON_CONTAINS(hierarchy, '1');  

Output:

Using Graph-Based Approach in MySQL JSON or NOSQL

Explanation: A node with an ancestor of 1 can be effectively filtered using the JSON_CONTAINS() function.

Performance Comparison of Each Method

Method Use Case Pros Cons
Recursive CTE This can be used when dynamic hierarchical queries are needed Easy to write It is not available in older versions of MySQL
Adjacency List Model This can be used for storing simple parent-child relationships Small hierarchies can be easily maintained and implemented Due to multiple self-joins, it performs poorly on deep hierarchies
Nested Set Model This can be used when retrieving the entire subtree is common Descendants can be retrieved in a single query Update and insert operations are complex
Materialized Path Model This can be used when hierarchical paths have to be stored as a string for traversal Easy to navigate without joins Requires string manipulation when updating paths
Closure Table Model Utilized in programs that need to seek ancestors and descendants frequently Supports complex Queries Extra space is needed because it stores all possible pairs in the table
Path Enumeration Model Effectively supports file system structures and other hierarchical searches by keeping indexed paths Enhancing the index supports deep hierarchies String operations are expensive

Real-World Examples

1. Organization Hierarchy 

Every employee has a manager in a database that houses the organization’s personnel hierarchy. For a particular employee, we must retrieve the complete hierarchy.

Example:

-- Create an Employee table
CREATE TABLE employee (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    manager_id INT NULL
);

-- Insert some values into it
INSERT INTO employee (id, name, manager_id) VALUES
(1, 'CEO', NULL),
(2, 'Manager A', 1),
(3, 'Manager B', 1),
(4, 'Employee 1', 2),
(5, 'Employee 2', 2),
(6, 'Employee 3', 3);

-- Query to find the Employee hierarchy
WITH RECURSIVE EmployeeHierarchy AS (
    SELECT id, name, manager_id, 1 AS level
    FROM employee
    WHERE id = 1  
    UNION ALL
    SELECT e.id, e.name, e.manager_id, eh.level + 1
    FROM employee e
    INNER JOIN EmployeeHierarchy eh ON e.manager_id = eh.id
)
SELECT * FROM EmployeeHierarchy;

Output:

1. Organization Hierarchy 

Explanation: By assigning a hierarchy level and identifying which employees report to which management, the recursive query in this instance dynamically constructs the organizational ladder.

2. Product Categories

E-commerce aims to effectively retrieve all subcategories when product details are stored in a hierarchical structure.

Example:

-- To create a category table
CREATE TABLE categories (
    id INT PRIMARY KEY,
    category_name VARCHAR(100) NOT NULL,
    lft INT NOT NULL,
    rgt INT NOT NULL
);

-- To insert some values
INSERT INTO categories (id, category_name, lft, rgt) VALUES
(1, 'Electronics', 1, 10),
(2, 'Laptops', 2, 5),
(3, 'Gaming Laptops', 3, 4),
(4, 'Phones', 6, 9),
(5, 'Smartphones', 7, 8);

-- Query to find all subcategories of Electronics
SELECT c2.category_name
FROM categories c1
JOIN categories c2 ON c2.lft BETWEEN c1.lft AND c1.rgt
WHERE c1.category_name = 'Electronics';

Output:

2. Product Categories

Explanation:  Through the selection of categories whose lft and rgt values are within the limits of “Electronics,” this query retrieves all of the subcategories of that heading.

Best Practices

  • Choose a Suitable Model based on performance, complexity, and functionality.
  • Create the indices on the hierarchy columns that will give a better query performance, primarily for deep hierarchies.
  • Maintain Data Integrity on hierarchical relationships, which is done through the use of foreign keys.
Start Your SQL Adventure!
Start Your SQL Journey for Free Today
quiz-icon

Conclusion

Making the correct model choice is essential for effective hierarchical data management in MySQL. Every method has its advantages and applications, including Recursive CTE, Adjacency List, Nested Set, Materialized Path, Closure Table, and Path Enumeration. You now have a better understanding of the various approaches to developing hierarchical recursive queries in MySQL, as well as the best techniques for putting them into practice.

To learn more about SQL functions, check out this SQL course and also explore SQL Interview Questions prepared by industry experts.

Hierarchical Recursive Query in MySQL – FAQs

Q1. How can hierarchical data be handled in SQL?

Using a Recursive Common Table Expression (CTE), the hierarchical data in SQL can be handled.

Q2. What do you mean by recursive hierarchy?

A recursive hierarchy in Master Data Services is a derived hierarchy with a recursive relationship.

Q3. How can a recursive SQL query be created?

The anchor member and the recursive member are the two components of a CTE that must be defined to construct a recursive SQL query.

Q4. What is meant by the phrase "hierarchy of queries in SQL"?

SQL queries use recursive CTEs or hierarchical queries similar to CONNECT BY for fetching hierarchical data.

Q5. How to use MySQL for retrieving hierarchical data?

Recursive Common Table Expressions (CTEs) are a useful tool for effectively retrieving hierarchical data in MySQL.

About the Author

Data Engineer

As a skilled Data Engineer, Sahil excels in SQL, NoSQL databases, Business Intelligence, and database management. He has contributed immensely to projects at companies like Bajaj and Tata. With a strong expertise in data engineering, he has architected numerous solutions for data pipelines, analytics, and software integration, driving insights and innovation.

business intelligence professional