When working with PostgreSQL, you will have to deal with complex data retrieval and manipulation tasks. Sometimes, you will need to use subqueries and LATERAL JOIN, which can be helpful for you. Although both give similar results, there are important differences in their characteristics and use cases. In this blog, you will understand LATERAL joins and subqueries and how each is different in detail.
Table of Contents:
Creating Sample Table
Before getting started with the methods, let us create a Student table, a Course table, and an enrollment table that can be used as an example for the following methods in PostgreSQL
Creating a table
CREATE TABLE students (
student_id INT PRIMARY KEY,
student_name VARCHAR(50),
email VARCHAR(100) UNIQUE
);
-- Inserting data
INSERT INTO students (student_id, student_name, email) VALUES
(1, 'Priya Sharma', '[email protected]'),
(2, 'Rahul Verma', '[email protected]'),
(3, 'Sneha Gupta', '[email protected]');
-- Displaying the students table
SELECT * FROM students;
Output:
This is how the Student table looks after creating and inserting the values.
Creating table
CREATE TABLE courses (
course_id INT PRIMARY KEY,
course_name VARCHAR(100),
duration_months INT,
price DECIMAL(8, 2)
);
-- Inserting data
INSERT INTO courses (course_id, course_name, duration_months, price) VALUES
(101, 'Data Science with Python', 6, 1200.00),
(102, 'Full Stack Web Development', 8, 1500.00),
(103, 'AWS Certified Solutions Architect', 4, 900.00),
(104, 'Digital Marketing Master Program', 5, 1100.00);
-- Displaying table
SELECT * FROM courses;
Output:
This is the Course table that looks after creating and inserting the values in PostgreSQL.
-- Creating table
CREATE TABLE enrollments (
enrollment_id INT PRIMARY KEY,
student_id INT REFERENCES students(student_id),
course_id INT REFERENCES courses(course_id),
enrollment_date DATE
);
-- Inserting data
INSERT INTO enrollments (enrollment_id, student_id, course_id, enrollment_date) VALUES
(201, 1, 101, '2024-03-15'),
(202, 1, 102, '2024-04-01'),
(203, 2, 101, '2024-03-20'),
(204, 3, 103, '2024-03-25'),
(205, 3, 104, '2024-04-05');
-- Displaying table
SELECT * FROM enrollments;
Output:
This is the enrollment table that looks after creating and inserting data into it.
Master LATERAL JOIN vs Subqueries in PostgreSQL
Enroll now and gain the edge in advanced PostgreSQL querying!
LATERAL JOIN In PostgreSQL
A LATERAL JOIN is used to enable a subquery in the FROM clause to reference columns from the previous tables in the same query. The LATERAL JOIN may be useful when utilizing functions that are generated to return a set of rows. The LATERAL JOIN is a kind of cross join to a subquery that has access to columns from the other table.
Syntax:
SELECT column
FROM table1
CROSS JOIN LATERAL (subquery) AS alias;
Examples of LATERAL JOIN:
Finding the most recent course in which students are enrolled.
SELECT s.student_name, e.enrollment_date, c.course_name
FROM students s
LEFT JOIN LATERAL (
SELECT enrollment_date, course_id
FROM enrollments
WHERE student_id = s.student_id
ORDER BY enrollment_date DESC
LIMIT 1
) AS e ON TRUE
LEFT JOIN courses c ON e.course_id = c.course_id;
Output:
Explanation: Here, it finds out the students who have enrolled in Intellipaat courses recently.
Subqueries in PostgreSQL
A subquery refers to a SQL query that is run within another SQL query (also known as an outer query or nested query). It is a way to produce a multi-step data retrieval process in a single SQL statement. The main purpose of a subquery is to provide information or a collection of results for the outer query to use in selection, filtering, etc.
Let’s understand subqueries with the example:
Find the names of the students who are enrolled in Intellipaat’s ‘data science with Python’ course.
Output:
SELECT student_name
FROM students
WHERE student_id IN (
SELECT student_id
FROM enrollments
WHERE course_id = (
SELECT course_id
FROM courses
WHERE course_name = 'Data Science with Python'
)
);
Output:
Explanation: Here, this query finds out the names of the Students from Intellipaat who are enrolled in the Data Science Course with Python.
Differences between LATERAL JOIN and Subqueries
Feature |
LATERAL JOIN |
Subqueries |
Row Processing |
Processing works on data row by row. |
Processing works on the entire result set. |
Purpose |
It is used to combine data from multiple tables into one |
They are used to sort, filter, and aggregate data in the query |
Duplicates |
They can create duplicate values if multiple tables have similar data. |
They are used to remove duplicates. |
Primary Use Cases |
Used in row calculation. |
Simple filtering, aggregation. |
Clarity |
Joins make queries easy to understand. |
Subqueries are difficult to read and understand. |
Advantages of LATERAL JOIN and Subqueries
LATERAL JOIN |
Subqueries |
LATERAL JOIN provides better data modification. |
Fetching data is simple in Subqueries. |
LATERAL JOIN allows powerful row-level operations. |
Subqueries work better in PostgreSQL. |
LATERAL JOIN allows row-wise data processing. |
Subqueries are readable. |
LATERAL JOINS support difficult queries. |
Subqueries support easy queries |
Disadvantages of LATERAL JOIN and Subqueries
LATERAL JOIN |
Subqueries |
LATERAL JOIN queries are difficult. |
Subqueries have limitations with complex operations. |
LATERAL JOIN can introduce performance overhead. |
Connected subqueries can be inefficient. |
LATERAL JOIN makes debugging difficult. |
Complex connection makes subqueries difficult. |
LATERAL JOINS have a potential performance issue. |
Subqueries with null-handling are difficult. |
Get 100% Hike!
Master Most in Demand Skills Now!
Real-world Examples of LATERAL JOINS
These are the following real-world examples of Lateral Joins
1. Generating Reports With a Running Balance
A financial company wants to generate a report showing each transaction with a running balance for the associated account
-- Create the transactions table
CREATE TABLE transactions (
transaction_id SERIAL PRIMARY KEY,
account_id INT,
transaction_date DATE,
transaction_amount DECIMAL(10, 2)
);
-- Sample Data
INSERT INTO transactions (account_id, transaction_date, transaction_amount) VALUES
(1, '2023-11-01', 100.00), (1, '2023-11-05', -50.00), (2, '2023-11-10', 200.00),
(1, '2023-11-15', 75.00), (2, '2023-11-20', -100.00);
-- Query using LATERAL JOIN
SELECT
t.transaction_id,
t.account_id,
t.transaction_date,
t.transaction_amount,
running_balance.balance
FROM
transactions t,
LATERAL (
SELECT SUM(transaction_amount) AS balance
FROM transactions
WHERE account_id = t.account_id AND transaction_date <= t.transaction_date
) AS running_balance;
Output:
Explanation: Here, the running balance for each account is calculated and uses a LATERAL join, which allows the subquery to access the current transaction’s account_id and transaction_date to calculate the running balance.
2. Calculate the SUM
A company has a table of sales transactions and wants to calculate the running total of sales for each customer and each transaction.
-- Create the sales table
CREATE TABLE sales (
sale_id SERIAL PRIMARY KEY,
customer_id INT,
sale_date DATE,
sale_amount DECIMAL(10, 2)
);
-- Insert sample data
INSERT INTO sales (customer_id, sale_date, sale_amount) VALUES
(1, '2023-11-01', 100.00),
(1, '2023-11-05', 150.00),
(2, '2023-11-02', 200.00),
(1, '2023-11-10', 75.00),
(2, '2023-11-08', 120.00);
-- Use LATERAL to calculate SUM
SELECT
s.sale_id,
s.customer_id,
s.sale_date,
s.sale_amount,
running_total.total_amount
FROM
sales s,
LATERAL
SELECT SUM(sale_amount) AS total_amount
FROM sales
WHERE customer_id = s.customer_id AND sale_date <= s.sale_date
) AS running_total;
Output:
Explanation: Here, we have correlated the subquery results to each row and calculated a running total of sales for each customer by summing the sales amount through each sale’s date using LATERAL JOIN.
Real-World Examples of Subqueries
These are the following real-world examples of subqueries
1. Finding Customers Who Have Placed Atleast One Order
An e-commerce company wants to find out the names of the customers who placed orders greater than $250.
--Create table customers
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(50),
region VARCHAR(50)
);
-- Insert values
INSERT INTO customers (customer_id, customer_name, region) VALUES
(1, 'Alice', 'North'),
(2, 'Bob', 'South'),
(3, 'Charlie', 'North'),
(4, 'David', 'East');
--Create table orders
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT REFERENCES customers(customer_id),
order_date DATE,
order_amount DECIMAL(10, 2)
);
--Insert data
INSERT INTO orders (order_id, customer_id, order_date, order_amount) VALUES
(101, 1, '2023-11-01', 150.00),
(102, 2, '2023-11-02', 200.00),
(103, 3, '2023-11-03', 260.00),
(104, 4, '2023-11-30', 300.00);
--Displaying the result
Select * from orders;
--Subquery to select customers
SELECT customer_name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
WHERE order_amount > 250.00
);
Output:
Explanation: Here, the WHERE Clause filters the returns to the name of the customer who has made the orders with an amount greater than 250.00.
2. Finding Employees in Departments with Above Average Salaries
A company wants to find out if employees in respective departments have salaries above average.
-- Creating table departments
CREATE TABLE departments (
dept_id SERIAL PRIMARY KEY,
dept_name VARCHAR(255)
);
--Creating the table employees
CREATE TABLE employees (
emp_id SERIAL PRIMARY KEY,
emp_name VARCHAR(255),
dept_id INT REFERENCES departments(dept_id),
salary DECIMAL(10, 2)
);
-- Inserting Data
INSERT INTO departments (dept_name) VALUES ('Sales'), ('Engineering'), ('HR');
INSERT INTO employees (emp_name, dept_id, salary) VALUES ('John', 1, 50000.00), ('Jane', 2, 70000.00), ('David', 1, 60000.00), ('Emily', 3, 45000.00), ('Michael', 2, 80000.00);
SELECT
e.emp_name
FROM
employees e
WHERE
e.dept_id IN (
SELECT
d.dept_id
FROM
departments d
WHERE
(SELECT SUM(salary) FROM employees WHERE dept_id = d.dept_id) > (SELECT AVG(total_salary) FROM (SELECT dept_id, SUM(salary) AS total_salary FROM employees GROUP BY dept_id) AS dept_salaries)
);
Output:
Explanation: Here, it calculates each department’s total salary, then the average of those totals is found, and returns the names of the employees whose salary is more than the average.
Unlock the Power of SQL
Dive into SQL – Learn the Basics for Free!
Conclusion
Subqueries and LATERAL JOIN are both alternative tools in PostgreSQL used for different purposes in the query. Subqueries can be used optimally for data fetching, such as filtering and aggregations. LATERAL JOIN used some other advanced features, such as row-by-row processing and dynamic calculations. Although subqueries are very simple to use, generally, LATERAL JOIN is very efficient for handling larger data.
To learn more about SQL functions, check out this SQL course and also explore SQL Interview Questions prepared by industry experts.
Lateral join vs Subqueries -FAQs
Q1. Can subqueries reference columns from the outer query?
Yes, subqueries can reference columns from the outer query, but only if they are correlated with the subquery.
Q2. What is the primary benefit of LATERAL JOIN?
The primary benefit of LATERAL JOIN is that it allows you to perform row-by-row operations and reference columns from preceding FROM items.
Q3. Is LATERAL JOIN always better than a subquery?
No LATERAL JOIN is a powerful tool, but it can be more complex to use for simple data.
Q4. Should I use a subquery?
Yes, you can use a subquery for simple filtering.
Q5. How do I handle set-returning functions with subqueries?
You can use UNNEST, but LATERAL JOIN will help to simplify this process.