How to Get the Last Record in Each Group Using MySQL?

Methods such as JOIN with MAX() and ROW_NUMBER() retrieve the last record in each group in MySQL. The need comes up often: the most recent message in each chat, the latest transaction for each account, or the lowest-ranked student in each class. This article walks through several methods for these scenarios.

Create a Sample Table

Let's create two sample tables to run these queries against:

CREATE TABLE employees (
    user_id INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);
INSERT INTO employees (user_id, name) VALUES
(1, 'Karan'),
(2, 'Ralph'),
(3, 'Lisa');

This creates the employees table, which stores each employee's name and user ID.

CREATE TABLE login_attempts (
    id INT AUTO_INCREMENT PRIMARY KEY,
    user_id INT NOT NULL,
    login_time DATETIME NOT NULL,
    status VARCHAR(10) NOT NULL
);
INSERT INTO login_attempts (user_id, login_time, status) VALUES
(1, '2024-02-15 10:00:00', 'Success'),
(1, '2024-02-15 14:15:00', 'Success'),
(2, '2024-02-15 09:45:00', 'Failed'),
(2, '2024-02-15 11:00:00', 'Success'),
(3, '2024-02-15 14:50:00', 'Failed'),
(3, '2024-02-15 15:30:00', 'Success');
ALTER TABLE login_attempts ADD CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES employees(user_id);

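This creates the login_attempts table, which records the time and status of each login attempt. Every employee has two attempts, so the last record in each group is that employee's most recent attempt.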

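Methods to Get the Last Record in Each Group Using MySQL

GROUP BY can find the latest value in each group, such as MAX(login_time) per user, but it does not return the rest of that row. Each method below works around this to retrieve the complete last record in each group.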
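To see why GROUP BY on its own is not enough, here is a minimal sketch against the sample login_attempts table. The alias last_login is an illustrative name, not part of the original schema.

```sql
-- Aggregating per user yields the latest timestamp for each group...
SELECT user_id, MAX(login_time) AS last_login
FROM login_attempts
GROUP BY user_id;

-- ...but id and status are not in the result. Adding a non-aggregated
-- column such as status to this SELECT list is rejected when the
-- ONLY_FULL_GROUP_BY SQL mode is enabled (it is part of the default
-- sql_mode in recent MySQL versions), because status is not determined
-- by user_id.
```

The methods that follow all start from this idea and then recover the full row that matches each group's latest value.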

Method 1: Using JOIN with MAX() in MySQL 

This method joins employees to login_attempts and uses MAX() in a correlated subquery to keep only each person's latest login.

Example:

SELECT e.user_id, e.name, l.login_time, l.status
FROM employees e
JOIN login_attempts l 
ON e.user_id = l.user_id
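WHERE l.login_time = (
    SELECT MAX(l2.login_time)
    FROM login_attempts l2
    WHERE l2.user_id = l.user_id
);

Explanation: The subquery is correlated on user_id, so it returns the latest login_time for the user in the current row rather than for the whole table. The query returns one row per employee: Karan at 14:15:00, Ralph at 11:00:00, and Lisa at 15:30:00, all with status Success.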

Method 2: Using SELF JOIN in MySQL

A self join joins a table to itself. Here, a LEFT JOIN pairs each login attempt with any later attempt by the same user, and the rows that have no later match are the last records.

Example:

SELECT la1.*
FROM login_attempts la1
LEFT JOIN login_attempts la2
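    ON la1.user_id = la2.user_id
    AND la1.login_time < la2.login_time
WHERE la2.id IS NULL;

Explanation: The LEFT JOIN compares each row (la1) with every later row for the same user (la2). When no later row exists, the la2 columns are NULL, so the WHERE clause keeps exactly the last attempt per user. With the sample data, these are the rows with id 2, 4, and 6.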

Method 3: Using ROW_NUMBER() in MySQL

The ROW_NUMBER() window function, available in MySQL 8.0 and later, numbers the rows within each group defined by PARTITION BY, so the last record in each group can be selected by its row number.

Example:

SELECT l.id, l.user_id, e.name, l.login_time, l.status
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_time DESC) AS rn
    FROM login_attempts
) l
JOIN employees e ON l.user_id = e.user_id
WHERE l.rn = 1;

Explanation: PARTITION BY user_id restarts the numbering for each user, and ORDER BY login_time DESC gives that user's most recent attempt rn = 1. The outer query keeps only the rows with rn = 1, which is the latest record per user.
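If two attempts by the same user can share a login_time, ordering by login_time alone leaves it undefined which of them gets rn = 1. The sketch below adds id as a tie-breaker, on the assumption that a higher id means a later insert. The alias ranked is illustrative.

```sql
-- Assumes the sample login_attempts table from earlier in the article.
SELECT id, user_id, login_time, status
FROM (
    SELECT la.*,
           ROW_NUMBER() OVER (
               PARTITION BY user_id
               -- id breaks ties between equal timestamps
               ORDER BY login_time DESC, id DESC
           ) AS rn
    FROM login_attempts la
) ranked
WHERE rn = 1;
```

With the second sort key in place, the query returns exactly one, repeatable row per user even when timestamps collide.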

Method 4: Using MAX(ID) in MySQL

Taking MAX(id) per group retrieves the last record for each user, provided that higher IDs correspond to later records.

Example:

SELECT l.id, l.user_id, e.name, l.login_time, l.status
FROM login_attempts l
JOIN employees e ON l.user_id = e.user_id
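WHERE l.id IN (
    SELECT MAX(id)
    FROM login_attempts
    GROUP BY user_id
);

Explanation: Because id is an auto-incremented primary key and the rows were inserted in chronological order, the highest id within each user's group belongs to that user's most recent login attempt. With the sample data, the query returns the attempts with id 2, 4, and 6.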

Method 5: Using CTE and window functions in MySQL

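A common table expression (CTE) combined with a window function like ROW_NUMBER() produces the same result as Method 3. The CTE does not make the query faster by itself, but it gives the intermediate result a name, which keeps complex queries readable and easier to maintain. CTEs also require MySQL 8.0 or later.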

Example:

WITH RankedAttempts AS (
    SELECT 
        l.id, 
        l.user_id, 
        l.login_time, 
        l.status, 
        ROW_NUMBER() OVER (PARTITION BY l.user_id ORDER BY l.login_time DESC) AS rn
    FROM login_attempts l
)
SELECT
    r.id,
    r.user_id,
    e.name,
    r.login_time,
    r.status
FROM RankedAttempts r
JOIN employees e ON r.user_id = e.user_id
WHERE r.rn = 1;

Explanation: The CTE RankedAttempts numbers each user's attempts from newest to oldest. The outer query joins it to employees and keeps the rows with rn = 1, giving the last record in each group. The result matches Method 3; only the organization of the query differs.

Real-world Examples

Example 1: Using JOIN with MAX(column):

CREATE TABLE orders (
    order_id INT PRIMARY KEY AUTO_INCREMENT,
    customer_id INT NOT NULL,
    order_date DATETIME NOT NULL,
    total_amount DECIMAL(10,2) NOT NULL
);
INSERT INTO orders (customer_id, order_date, total_amount) VALUES
(1, '2024-02-15 10:30:00', 150.00),
(1, '2024-02-16 12:00:00', 200.00),
(2, '2024-02-15 09:45:00', 300.00),
(2, '2024-02-17 14:15:00', 250.00),
(3, '2024-02-16 16:00:00', 100.00);
SELECT o.*
FROM orders o
JOIN (
    SELECT customer_id, MAX(order_date) AS latest_order
    FROM orders
    GROUP BY customer_id
) subquery
ON o.customer_id = subquery.customer_id 
AND o.order_date = subquery.latest_order
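ORDER BY o.order_date DESC;

Explanation: The subquery finds each customer's latest order_date, and the join back to orders retrieves the full row for that order. The result has one row per customer, newest first: customer 2 (250.00), customer 3 (100.00), and customer 1 (200.00).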

Example 2: Using ROW_NUMBER() in a Subquery:

CREATE TABLE exam_scores (
    score_id INT PRIMARY KEY AUTO_INCREMENT,
    student_id INT NOT NULL,
    exam_date DATE NOT NULL,
    score INT NOT NULL
);
INSERT INTO exam_scores (student_id, exam_date, score) VALUES
(101, '2024-02-10', 85),
(101, '2024-02-15', 90),
(102, '2024-02-12', 78),
(102, '2024-02-14', 82),
(103, '2024-02-18', 88);
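SELECT score_id, student_id, exam_date, score
FROM (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY student_id
        ORDER BY exam_date DESC, score_id DESC
    ) AS rn
    FROM exam_scores
) ranked
WHERE rn = 1;

Explanation: ROW_NUMBER() numbers each student's exams from the most recent date backwards, with score_id as a tie-breaker for exams on the same date. Keeping rn = 1 returns each student's latest exam: a score of 90 for student 101, 82 for student 102, and 88 for student 103.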

Example 3: Using MAX(id) (Best for Auto-Incremented Primary Keys)

CREATE TABLE support_tickets (
    ticket_id INT PRIMARY KEY AUTO_INCREMENT,
    user_id INT NOT NULL,
    created_at DATETIME NOT NULL,
    status VARCHAR(20) NOT NULL
);

INSERT INTO support_tickets (user_id, created_at, status) VALUES
(1, '2024-02-15 08:00:00', 'Open'),
(1, '2024-02-16 10:00:00', 'Closed'),
(2, '2024-02-15 12:30:00', 'Open'),
(2, '2024-02-17 09:45:00', 'Resolved'),
(3, '2024-02-18 14:20:00', 'Open');
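-- Fetch the most recent ticket for each user
SELECT *
FROM support_tickets
WHERE ticket_id IN (
    SELECT MAX(ticket_id)
    FROM support_tickets
    GROUP BY user_id
);

Explanation: MAX(ticket_id) grouped by user_id finds the most recently created ticket for each user. With the sample data, these are tickets 2 (Closed), 4 (Resolved), and 5 (Open).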

Example 4: Using CTE with ROW_NUMBER():

CREATE TABLE product_prices (
    price_id INT PRIMARY KEY AUTO_INCREMENT,
    product_id INT NOT NULL,
    updated_at DATETIME NOT NULL,
    price DECIMAL(10,2) NOT NULL
);
INSERT INTO product_prices (product_id, updated_at, price) VALUES
(10, '2024-02-15 10:00:00', 25.99),
(10, '2024-02-16 14:30:00', 26.49),
(20, '2024-02-14 09:20:00', 50.00),
(20, '2024-02-17 11:45:00', 51.00),
(30, '2024-02-18 16:00:00', 75.50);
WITH LatestPrices AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY updated_at DESC) AS rn
    FROM product_prices
)
SELECT price_id, product_id, updated_at, price
FROM LatestPrices
WHERE rn = 1;

Explanation: The CTE numbers each product's price rows from newest to oldest, and the outer query keeps the latest price for each product: 26.49 for product 10, 51.00 for product 20, and 75.50 for product 30.

Performance Comparison of Methods

| Method | Performance | Best use case | Syntax |
| --- | --- | --- | --- |
| JOIN with MAX() | Performs well on small and medium datasets. On large tables the GROUP BY subquery becomes costly unless the grouping and ordering columns are indexed. | Best when timestamps are unique within each group, since duplicates return more than one row per group. | MAX(login_time) ... GROUP BY user_id |
| ROW_NUMBER() | Works well on smaller datasets, but sorting and partitioning may slow it down on large ones. | Best when precise ranking is needed, especially if there are duplicate timestamps. | ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_time DESC) |
| MAX(id) | Usually the fastest method, because it relies on primary key lookups. | Best for large tables with an auto-increment ID that grows with time, so the highest ID in a group is its latest record. | WHERE id IN (SELECT MAX(id) FROM login_attempts GROUP BY user_id) |
| CTE | Uses ROW_NUMBER() inside a CTE, so it performs like the ROW_NUMBER() method. The gain is readability rather than speed. | Works well for complex queries that need to stay maintainable. | WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_time DESC) AS rn FROM login_attempts) |
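Much of the difference between these methods depends on indexing. As a hedged sketch, a composite index on the grouping column followed by the ordering column may help the per-group lookups. The index name idx_login_user_time is hypothetical, and whether the optimizer uses it depends on the MySQL version and the data.

```sql
-- Hypothetical index: grouping column first, ordering column second.
CREATE INDEX idx_login_user_time
    ON login_attempts (user_id, login_time);

-- EXPLAIN reports the plan the optimizer chooses, including which
-- index (if any) it picks for the per-user MAX() lookup.
EXPLAIN
SELECT user_id, MAX(login_time) AS last_login
FROM login_attempts
GROUP BY user_id;
```

Running EXPLAIN before and after creating the index is a simple way to check whether it changes the plan on your own data.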

Conclusion

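You can get the last record in each group in MySQL using JOIN with MAX(), a self join, ROW_NUMBER(), MAX(id), or a CTE with ROW_NUMBER(). Each method identifies the latest row per group from an ordering column such as a timestamp or an auto-increment ID. An index on the grouping and ordering columns generally makes these queries faster. ROW_NUMBER() always returns exactly one row per group, even with duplicate timestamps, but its sorting cost can grow on large tables. Wrapping it in a CTE does not change that cost; it makes the same query easier to read and maintain. Choose the method based on your MySQL version, your data size, and whether the ordering column is unique within each group.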

FAQs

1. How can I use JOIN with MAX() to get the last record in each group?

Use JOIN with the MAX() function to link each group with its latest record based on a specific column like date or ID.
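Applied to the sample login_attempts table, that approach might look like the sketch below. The aliases latest and last_login are illustrative names.

```sql
-- Assumes the sample login_attempts table from earlier in the article.
SELECT la.id, la.user_id, la.login_time, la.status
FROM login_attempts la
JOIN (
    -- One row per user holding that user's latest login_time
    SELECT user_id, MAX(login_time) AS last_login
    FROM login_attempts
    GROUP BY user_id
) latest
    ON la.user_id = latest.user_id
   AND la.login_time = latest.last_login;
```

If a user has two attempts with the same latest login_time, this form returns both of them.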

2. How does ROW_NUMBER() help in getting the last record per group?

ROW_NUMBER() assigns a sequential number to each row within a partition. By partitioning on the group column, ordering the rows in descending order, and selecting the row numbered 1, you get the latest record in each group.

3. When should I use CTE and window functions for this task?

Use a CTE with window functions when readability matters, especially in complex queries that combine multiple filtering or sorting conditions.

About the Author

Data Engineer

As a skilled Data Engineer, Sahil excels in SQL, NoSQL databases, Business Intelligence, and database management. He has contributed immensely to projects at companies like Bajaj and Tata. With a strong expertise in data engineering, he has architected numerous solutions for data pipelines, analytics, and software integration, driving insights and innovation.