How to SELECT Rows with MAX PARTITION BY Another Column in MySQL

How to SELECT Rows with MAX PARTITION BY Another Column in MySQL

Answer: In MySQL, the rows with a max(Column value) can be selected using ROW_NUMBER().

The SELECT row with MAX(column value) function is important as it helps to retrieve the rows with the highest value in the specific column for each partitioned group. In this blog, you will explore how we can SELECT rows with MAX(Column value), and PARTITION BY another column in MySQL.

Table of Contents:

First, let’s create a table for Employees consisting of EmployeeID, Department, and salary, and use it for all the examples.

–Creation of employees table
CREATE TABLE employees (
    id INT PRIMARY KEY,
    Department VARCHAR(50),
    salary INT
);

Now let us insert some of the values into the employee’s table using the INSERT INTO command.

–Inserting values into employees table
INSERT INTO employees (id, department, salary) VALUES
(1, 'BDA', 5000),
(2, 'BDA', 7000),
(3, 'TCW', 6000),
(4, 'TCW', 9000),
(5, 'TRA', 7500),
(6, 'TRA', 8500);

--To display the Employees table 
Select * from employees ;

This is how the employees table looks like once it is created and inserted with the values.

To select rows with MAX(column value), PARTITION BY another column in MySQL. Let’s first explore how the SELECT command works.

SELECT in MySQL

The select statement helps to retrieve the data from one or more tables in the database.

Syntax: 

Select ColumnName1, ColumnName2 
From table_name
Where condition
Order by column1 asc ;

Example: 

SELECT *
 from employees;

Output: 

Explanation: Here the SELECT * retrieves all the data from the employees table.

Select Rows with MAX() in MySQL 

Let us select rows with maximum salary using MAX(), which is a type of aggregate function that returns the maximum value of a column for a set of rows.

Syntax: 

MAX(colunm_value)

Example: 

Select MAX(Salary) 
From employee;

Output: 

Explanation: It will return the highest salary across all rows in the employee table by comparing the salaries of all the employees. 

PARTITION BY in MySQL 

In MySQL, PARTITION BY is not used with the select statement rather it can be used along with the Windows function. PARTITION BY, uses many Windows function like ROW_NUMBER, RANK(), DENSE_RANK(), SUM(), and AVG(). Now let us learn about these Windows functions in detail.

Windows function in MySQL 

Windows function in MySQL calculates across a set of rows without collapsing the rows into a single result similar to the GROUP BY clause. This helps in returning the max value.

Windows  FunctionDescription
                ROW_NUMBER()Assigns a ranking but skips numbers if there are ties.
                      RANK()Assigns a ranking but skips numbers if there are ties.
                  DENSE_RANK()Like RANK(), but without skipping numbers for ties.
                        SUM()It computes the total within a Partition.
                        AVG()It computes the average value within a Partition.

Syntax for Windows function 

SELECT column1, column2, 
       window_function() OVER (PARTITION BY column_name ORDER BY column_name DESC) AS alias
FROM table_name;

Now let us know about how this PARTITION BY can be used with the Windows function.

1. PARTITION BY using ROW_NUMBER() in MySQL

ROW_NUMBER() is a Window function that assigns a unique row number to each row within a partition or a group.

Syntax: 

SELECT column1, column2, 
       ROW_NUMBER() OVER (PARTITION BY partition_column ORDER BY order_column DESC) AS row_num
FROM table_name;

Example:

Let us assign row numbers based on the highest salary in each department using the

ROW_NUMBER()

SELECT id, department, salary,
       ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;

Output: 

Explanation: The PARTITION BY Department, groups employees by the department, and the ORDER BY salary DESC will first assign row numbers based on the highest salary. The maximum salary paid in each department is displayed

2. PARTITION BY using RANK() or DENSE_RANK() in MySQL

RANK() is a Window function that assigns a ranking to each row within a partition. However, if there are duplicate values, the same rank is assigned to those rows and the next rank is skipped. Whereas in DENSE_RANK() it does not skip ranking numbers when the ranks are tied and returns all the ranks.

Syntax:

SELECT id, department, salary,

       RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_num,

       DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank_num

FROM employees;

Let us consider employee tables with duplicate values for a better understanding of the RANK() and DENSE_RANK() functions 

Explanation: Here, the salary field contains duplicate values so let us implement the RANK() and DENSE_RANK() functions.

Example: 

SELECT id, department, salary,

       RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_num,

       DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank_num

FROM employees;

Output:

Explanation: Here the ranks of employees within each department are based on salary in descending order. RANK() skips numbers for ties, while DENSE_RANK() returns all without skipping.

3. PARTITION BY using MAX() with JOIN in MySQL

If you are using an older version of MySQL that does not support the Window function then you can achieve the same result by using MAX() with JOIN.

Syntax:

SELECT column1, column2, ...

FROM table_name t1

JOIN (

    SELECT partition_column, MAX(aggregate_column) AS max_value

    FROM table_name

    GROUP BY partition_column

) t2 

ON t1.partition_column = t2.partition_column AND t1.aggregate_column = t2.max_value;

Example: 

SELECT e.id, e.department, e.salary

FROM employees e

JOIN (

    SELECT department, MAX(salary) AS max_salary

    FROM employees

    GROUP BY department

) m 

ON e.department = m.department AND e.salary = m.max_salary;

Output:

Explanation: As the department BDA has two employees with the highest salary both are included in the output.

4. PARTITION BY Using MAX() with WHERE EXISTS in MySQL

This approach is efficient because it avoids JOIN operation making it more efficient for larger datasets.

Syntax:

SELECT column1, column2, ...
FROM table_name t1
WHERE EXISTS (
    SELECT 1
    FROM table_name t2
    WHERE t1.partition_column = t2.partition_column
    GROUP BY t2.partition_column
    HAVING MAX(t2.aggregate_column) = t1.aggregate_column
);

Example: 

SELECT e.id, e.department, e.salary
FROM employees e
WHERE EXISTS (
    SELECT 1 
    FROM employees e2
    WHERE e.department = e2.department
    GROUP BY e2.department
    HAVING MAX(e2.salary) = e.salary
);

Output:

Explanation: The HAVING MAX() function ensures that employees with the highest salaries from each department are selected.

Alternative Approaches for Selecting the Rows

There are two alternative approaches for selecting the rows:

1. SUBQUERY with INNER JOIN 

We can use an INNER JOIN to join the employee’s table with itself or another table (if available). The subquery can be used to retrieve specific data that can be joined with the main query.

Syntax:

SELECT t1.*
FROM your_table t1
INNER JOIN (
    SELECT partition_column, MAX(value_column) AS max_value
    FROM your_table
    GROUP BY partition_column
) t2 ON t1.partition_column = t2.partition_column AND t1.value_column = t2.max_value;

Example: 

SELECT e1.*
FROM employees e1
INNER JOIN (
    SELECT department, MAX(salary) AS max_salary
    FROM employees
    GROUP BY department
) e2 ON e1.department = e2.department
AND e1.salary = e2.max_salary;

Output:

Explanation: The inner query (e2) finds the maximum salary (MAX(salary)) for each department, grouping by department.

2. CORRELATED SUBQUERY

A correlated subquery is a subquery that runs once for each row in the main query. It refers to a column from the main query, making it useful for row-by-row comparisons

Syntax:

SELECT t1.*
FROM your_table t1
WHERE value_column = (
    SELECT MAX(value_column)
    FROM your_table t2
    WHERE t1.partition_column = t2.partition_column
);

Example: 

SELECT e1.*
FROM employees e1
WHERE salary = (
    SELECT MAX(e2.salary)
    FROM employees e2
    WHERE e1.department = e2.department
);

Output:

Explanation: The subquery finds the maximum salary in the same department (e1.department = e2.department).

How does MAX() with JOIN differ from MAX() with WHERE EXISTS?

Approach Performance Use Case
MAX() with JOINCan be slower for large datasets When you need extra columns from both tables 
MAX() with WHERE EXISTSMore Optimised by avoiding duplicate scans When you only need filtered data.

Performance Comparison of Different Windows Functions in PARTITION BY

ApproachPerformanceProsCons
ROW_NUMBER()Fastest with an indexIt is optimized using indexing and also handles the duplicates well.Not available in MySQL version < 8.0
RANK() Fast, but slower than ROW_NUMBER()Returns all rows with the max value per partition if there are tiesCan return multiple rows per partition 
MAX() wit JOIN()Efficient with proper indexing Uses GROUP BY + JOIN for filteringMight slow down with large datasets
MAX() with WHERESlowest due to correlated subqueriesNo need for CTE or any advanced window functions Extremely slow for large datasets

Real-world Use Cases 

1. E-Commerce System: This is essential for tracking the most recent purchase, which can be used for customer support or marketing campaigns.

Example: 

--Create an order table 
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    total_amount DECIMAL(10, 2),
    INDEX (customer_id, order_date)
);

--Insert sample data into the order table
INSERT INTO orders (customer_id, order_date, total_amount) VALUES
(101, '2024-01-01', 100.00),
(102, '2024-02-01', 200.00),
(101, '2024-03-01', 150.00),
(103, '2024-01-15', 300.00),
(102, '2024-04-10', 250.00);

--Query to rank the orders
WITH ranked_orders AS (
    SELECT *,

           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
    FROM orders
)
SELECT * 
FROM ranked_orders 
WHERE rn = 1;

Output: 

Explanation: In this query, For each customer, ROW_NUMBER() ranks their orders starting from 1 for the most recent order and continues.

2. HR System: In an HR system, when we want to track the latest salary of each employee.

Example:

– Create a table 
CREATE TABLE employee_salaries (
    salary_id INT PRIMARY KEY,
    employee_id INT,
    salary DECIMAL(10, 2),
    salary_date DATE,
    INDEX idx_employee_salary (employee_id, salary_date)
);

--Inserting the sample data into the table
INSERT INTO employee_salaries (salary_id, employee_id, salary, salary_date) VALUES
(1, 101, 50000.00, '2024-01-01'),
(2, 102, 60000.00, '2024-02-01'),
(3, 101, 55000.00, '2024-03-01'),
(4, 103, 70000.00, '2024-01-15'),
(5, 102, 65000.00, '2024-04-10');

-- To display the latest salary of each department 
WITH ranked_salaries AS (
    SELECT *,

           ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY salary_date DESC) AS rn
    FROM employee_salaries
)
SELECT * 
FROM ranked_salaries 
WHERE rn = 1;

Output: 

Explanation: This query will return the latest salary for each employee based on the most recent salary date.

3. Web Application: In a web application, when we track user logins, and need to fetch each user’s latest login activity.

Example:

-- Create a user_logins table 
CREATE TABLE user_logins (
    login_id INT PRIMARY KEY,
    user_id INT,
    login_time DATETIME,
    INDEX idx_user_login (user_id, login_time)
);

-- Insert some values into it 
INSERT INTO user_logins (login_id, user_id, login_time) VALUES
(1, 101, '2024-01-01 08:00:00'),
(2, 102, '2024-02-01 09:00:00'),
(3, 101, '2024-03-01 10:30:00'),
(4, 103, '2024-01-15 11:00:00'),
(5, 102, '2024-04-10 14:20:00');

--Query to get the most recent Login for each user
WITH ranked_logins AS (
    SELECT *,

           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_time DESC) AS rn
    FROM user_logins
)
SELECT * 
FROM ranked_logins 
WHERE rn = 1;

Output: 

Explanation: For each user_id, the logins are ranked by login_time in descending order (most recent first).

Conclusion

Selecting rows with MAX(Column Value) partitioned by another column is a common SQL problem with different solutions based on the MySQL version. In MySQL 8.0+, window functions like RANK() OVER (PARTITION BY …) offer the most efficient approach. For older versions, you can use JOIN with MAX() or WHERE EXISTS. Understanding these methods helps in effectively selecting the MAX(Column Value) from a table.

FAQs

1. What if multiple rows have the same max value?

RANK() assigns the same rank to all, while ROW_NUMBER() picks only one row per group, even if values are the same.

2. Can I use MAX() directly with PARTITION BY?

No, MAX() is an aggregate function and works with GROUP BY. To use it with PARTITION BY, you need a window function like RANK().

3. What is the difference between RANK() and DENSE_RANK() ?

RANK() skips numbers when there are ties, while DENSE_RANK() assigns consecutive numbers without skipping any ranks.

About the Author

Data Engineer

As a skilled Data Engineer, Sahil excels in SQL, NoSQL databases, Business Intelligence, and database management. He has contributed immensely to projects at companies like Bajaj and Tata. With a strong expertise in data engineering, he has architected numerous solutions for data pipelines, analytics, and software integration, driving insights and innovation.

business intelligence professional