SQL SELECT DISTINCT

SQL SELECT DISTINCT is a form of the SELECT statement that returns only unique rows. The DISTINCT keyword, placed right after SELECT, removes duplicate rows from the query result; it does not change the data stored in the table. Uniqueness is evaluated across all the columns listed in the query. In this article, we will learn about SELECT DISTINCT in SQL, its performance, and its use cases.

What is DISTINCT in SQL?

DISTINCT is a SQL keyword used to fetch only the unique values in a column, or the unique combinations of values in multiple columns. It removes the duplicates from the query result, so each value (or combination) is returned once. DISTINCT is placed right after SELECT and applies to all the columns listed in the statement.

Syntax:

SELECT DISTINCT column1, column2, ...
FROM table_name;

Example:

CREATE TABLE employees (
    id INT,
    name VARCHAR(100),
    department VARCHAR(50),
    job_title VARCHAR(50)
);
INSERT INTO employees (id, name, department, job_title) VALUES
(1, 'Aarush', 'HR', 'Manager'),
(2, 'Binni', 'IT', 'Developer'),
(3, 'Gaurav', 'Marketing', 'Analyst');
SELECT DISTINCT department FROM employees;

Output:

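department
HR
IT
Marketing

Explanation: Here, the DISTINCT keyword returned each department once. In this sample table every employee belongs to a different department, so all three values appear; if two employees shared a department, it would still be listed only once.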

Why Do We Need To Use DISTINCT?

DISTINCT is a keyword that is used to remove duplicate rows from a query result. It helps when the same value is stored on many rows but you need each value only once, for example a list of departments or cities. Because every returned value is unique, the result is easier to read and process. The DISTINCT keyword is very useful when building reports or summarizing logs.

When to Use DISTINCT?

We can use the DISTINCT keyword when we are querying a large dataset where duplicates are inevitable and only the unique values are of interest. DISTINCT leaves the duplicates out of the result; it does not delete anything from the table. When you need to create reports that list each value once, you can use DISTINCT.

Ways to Use the DISTINCT Keyword in Columns

DISTINCT can be used with a single column or with multiple columns. It always applies to the whole select list: with several columns, a row counts as a duplicate only when every selected column matches, so DISTINCT cannot be applied to just one column of a multi-column query. It may affect performance on a large dataset, as the database has to sort or hash the selected rows to find the duplicates.

Let’s create a dataset to get unique values from a single column and multiple columns. 

Example:

CREATE TABLE orders (
    order_id INT,
    customer_name VARCHAR(50),
    product_name VARCHAR(50)
);
INSERT INTO orders (order_id, customer_name, product_name) VALUES
(1, 'Karan', 'Laptop'),
(2, 'Yuva', 'Phone'),
(3, 'Karan', 'Laptop'),
(4, 'Yuva', 'Tablet'),
(5, 'Karan', 'Phone');

Using DISTINCT with a Single Column in SQL

You can apply the DISTINCT keyword to a particular column to get only the unique values from the table. 

Syntax:

SELECT DISTINCT column_name
FROM table_name;

Example:

SELECT DISTINCT customer_name FROM orders; 

Output:

customer_name
Karan
Yuva

Explanation: Here, the query selects a single column (customer_name) from the orders table. Even though Karan appears three times and Yuva twice in the table, the DISTINCT keyword returns each name only once.

Using DISTINCT with Multiple Columns in SQL

You can use DISTINCT with multiple columns at the same time to get the unique combinations of values across those columns.

Syntax:

SELECT DISTINCT column1, column2
FROM table_name;

Example:

CREATE TABLE orders (
    order_id INT,
    customer_name VARCHAR(50),
    product_name VARCHAR(50)
);
INSERT INTO orders (order_id, customer_name, product_name) VALUES
(1, 'Karan', 'Laptop'),
(2, 'Yuva', 'Phone'),
(3, 'Karan', 'Laptop'),
(4, 'Yuva', 'Tablet'),
(5, 'Karan', 'Charger');
SELECT DISTINCT customer_name, product_name FROM orders;

Output:

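customer_name | product_name
Karan | Laptop
Yuva | Phone
Yuva | Tablet
Karan | Charger

Explanation: DISTINCT compares the combination of both columns. Orders 1 and 3 are both (Karan, Laptop), so that pair is returned only once. Karan and Yuva still appear more than once because each remaining row pairs the name with a different product.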
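A related question is how many unique combinations exist. The sketch below is one portable way to count them, by wrapping the DISTINCT query in a derived table. It assumes the orders table and the five sample rows from the example above; the alias unique_pairs is a hypothetical name chosen for illustration.

```sql
-- Assumes the orders table and rows from the example above.
-- "unique_pairs" is a hypothetical alias for the derived table.
SELECT COUNT(*) AS unique_pair_count
FROM (
    SELECT DISTINCT customer_name, product_name
    FROM orders
) unique_pairs;
-- With the five sample rows above, this returns 4.
```

The derived table is used because not every database accepts several columns inside COUNT(DISTINCT ...).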

Using DISTINCT with Other SQL Clauses

DISTINCT can be used with other clauses in SQL, like the ORDER BY, WHERE, and LIMIT clauses.

Let’s create a dataset to perform DISTINCT with other SQL Clauses. 

CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(50),
    city VARCHAR(50),
    states VARCHAR(50)
);
INSERT INTO customers (customer_id, customer_name, city, states) VALUES
(1, 'Johar', 'Mumbai', 'Maharashtra'),
(2, 'Babu', 'Lucknow', 'UP'),
(3, 'Tinku', 'Punjab', 'Chandigarh'),
(4, 'Chahar', 'Lucknow', 'UP');
SELECT * FROM customers;

Output:

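customer_id | customer_name | city | states
1 | Johar | Mumbai | Maharashtra
2 | Babu | Lucknow | UP
3 | Tinku | Punjab | Chandigarh
4 | Chahar | Lucknow | UP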

This is how the table looks before applying other SQL Clauses. 

1. DISTINCT with ORDER BY in SQL

DISTINCT removes the duplicates from the result, and then ORDER BY sorts the remaining rows by the specified column.

Example:

SELECT DISTINCT city FROM customers
ORDER BY city ASC;

Output:

city
Lucknow
Mumbai
Punjab

Explanation: DISTINCT first removed the duplicate Lucknow, and then ORDER BY arranged the remaining city names in ascending order.
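One restriction is worth knowing when combining the two: in several databases, including PostgreSQL and SQL Server, a SELECT DISTINCT query can only be ordered by columns that appear in the select list. The sketch below shows the rejected form as a comment and one possible rewrite with GROUP BY. It assumes the customers table created above.

```sql
-- Assumes the customers table created above.

-- Rejected by PostgreSQL and SQL Server, because customer_name
-- is not part of the select list:
-- SELECT DISTINCT city FROM customers ORDER BY customer_name;

-- One possible rewrite: group by city and sort by an aggregate
-- of the other column.
SELECT city
FROM customers
GROUP BY city
ORDER BY MIN(customer_name);
-- With the sample data this returns Lucknow, Mumbai, Punjab
-- (sorted by Babu, Johar, Tinku).
```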

2. DISTINCT with WHERE Clause in SQL

The WHERE clause will first filter the table based on a specific condition, and then the DISTINCT will apply to that to remove duplicates. 

Example:

SELECT DISTINCT city
FROM customers
WHERE states = 'UP';

Output:

city
Lucknow

Explanation: Here, the WHERE clause first kept only the rows whose states value is 'UP' (customers 2 and 4, both in Lucknow), and then DISTINCT collapsed them into a single city.

3. DISTINCT with LIMIT in SQL 

DISTINCT removes the duplicates first, and then the LIMIT clause caps the number of rows returned. LIMIT is supported by MySQL, PostgreSQL, and SQLite; SQL Server uses TOP and Oracle uses FETCH FIRST instead.

Example:

SELECT DISTINCT states
FROM customers
LIMIT 4;

Output:

states
Maharashtra
UP
Chandigarh

Explanation: Here, the DISTINCT keyword first removes the duplicate states values from the customers table. The LIMIT 4 clause then allows at most 4 rows, but the table contains only 3 distinct states, so all 3 are returned.
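When LIMIT actually cuts rows off, the rows that survive are not predictable unless the result is sorted first. A minimal sketch, assuming the customers table created above and a database that supports LIMIT:

```sql
-- Assumes the customers table created above.
-- Sorting before limiting makes the two returned rows predictable.
SELECT DISTINCT states
FROM customers
ORDER BY states
LIMIT 2;
-- With the sample data this returns Chandigarh and Maharashtra.

-- SQL Server has no LIMIT; the equivalent there would be:
-- SELECT DISTINCT TOP (2) states FROM customers ORDER BY states;
```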

DISTINCT with Aggregate Functions in SQL

Many aggregate functions in SQL accept DISTINCT, so that they operate only on the unique values of a column.

1. Using COUNT() with DISTINCT in SQL

DISTINCT reduces the column to its unique values, and the COUNT() function then counts how many unique values there are.

Example:

SELECT COUNT(*) AS total_customers FROM customers;
SELECT COUNT(DISTINCT city) AS unique_cities FROM customers;

Output:

total_customers
4

unique_cities
3

Explanation: The first query counts all the rows in the customers table, which gives 4. In the second query, DISTINCT first reduces the city column to its unique values (Mumbai, Lucknow, and Punjab), and COUNT() then counts them, which gives 3.
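NULL values behave differently in the two forms: COUNT(DISTINCT column) skips NULLs, while SELECT DISTINCT returns NULL as one row of its own. The sketch below illustrates this with a small hypothetical table, city_samples, which is not part of the article's dataset.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE city_samples (
    id INT,
    city VARCHAR(50)
);
INSERT INTO city_samples (id, city) VALUES
(1, 'Mumbai'),
(2, 'Mumbai'),
(3, NULL),
(4, NULL);

-- NULLs are ignored by COUNT(DISTINCT ...): returns 1.
SELECT COUNT(DISTINCT city) AS unique_cities FROM city_samples;

-- SELECT DISTINCT keeps NULL as a single row: returns 2 rows
-- (Mumbai and NULL).
SELECT DISTINCT city FROM city_samples;
```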

2. Using SUM() with DISTINCT in SQL

The SUM() function adds up every value in a column. If we apply DISTINCT to it, SUM() adds each distinct value only once, so repeated values are not counted again.

Example:

CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(20),
    city VARCHAR(20),
    states VARCHAR(20)
);
INSERT INTO customers (customer_id, customer_name, city, states) VALUES
(1, 'Johar', 'Mumbai', 'Maharashtra'),
(2, 'Babu', 'Lucknow', 'UP'),
(3, 'Tinku', 'Punjab', 'Chandigarh'),
(4, 'Chahar', 'Lucknow', 'UP');
ALTER TABLE customers ADD total_orders INT;
UPDATE customers SET total_orders = 5 WHERE customer_id = 1;
UPDATE customers SET total_orders = 3 WHERE customer_id = 2;
UPDATE customers SET total_orders = 3 WHERE customer_id = 3;
UPDATE customers SET total_orders = 6 WHERE customer_id = 4;
SELECT SUM(DISTINCT total_orders) AS total_order_sum_distinct
FROM customers;

Output: 

total_order_sum_distinct
14

Explanation: Here, a plain SUM(total_orders) would add every value in the total_orders column, including the repeated 3, giving 5 + 3 + 3 + 6 = 17. SUM(DISTINCT total_orders) adds each distinct value just once, giving 5 + 3 + 6 = 14.
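To see the difference side by side, both sums can be computed in one query. This is a small sketch that assumes the customers table with the total_orders values set in the example above; the column aliases are illustrative.

```sql
-- Assumes the customers table and total_orders values from the example above.
SELECT
    SUM(total_orders)          AS sum_all,      -- 5 + 3 + 3 + 6 = 17
    SUM(DISTINCT total_orders) AS sum_distinct  -- 5 + 3 + 6 = 14
FROM customers;
```

Note that the two customers with 3 orders each are both real, so sum_all is the true total here. SUM(DISTINCT ...) is appropriate only when repeated values genuinely represent the same thing counted twice.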

3. Using AVG() with DISTINCT in SQL

AVG(DISTINCT column) averages only the distinct values in a column: the duplicates are removed first, and then the average of the remaining values is calculated.

Example:

ALTER TABLE customers ADD total_orders INT;
UPDATE customers SET total_orders = 5 WHERE customer_id = 1;
UPDATE customers SET total_orders = 3 WHERE customer_id = 2;
UPDATE customers SET total_orders = 3 WHERE customer_id = 3;
UPDATE customers SET total_orders = 6 WHERE customer_id = 4;
SELECT AVG(DISTINCT total_orders) AS avg_distinct_orders
FROM customers;

Output:

avg_distinct_orders
4.6667

Explanation: Here, the plain average of the four values would be (5 + 3 + 3 + 6) / 4 = 4.25. With DISTINCT, the repeated 3 is counted only once, so the average is (5 + 3 + 6) / 3, which is about 4.67. The exact formatting varies by database, and databases that use integer arithmetic for INT columns, such as SQL Server, return 4.

DISTINCT in Views in MySQL

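A view can be used when you want to fetch the same unique values frequently. The example below is written for MySQL, but CREATE VIEW with SELECT DISTINCT is standard SQL and works in most other databases as well.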

Example:

CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(20),
    city VARCHAR(20),
    states VARCHAR(20)
);
INSERT INTO customers (customer_id, customer_name, city, states) VALUES
(1, 'Johar', 'Mumbai', 'Maharashtra'),
(2, 'Babu', 'Lucknow', 'UP'),
(3, 'Tinku', 'Punjab', 'Chandigarh'),
(4, 'Chahar', 'Lucknow', 'UP');
CREATE VIEW distinct_cities AS
SELECT DISTINCT city FROM customers; 
ALTER TABLE customers ADD total_orders INT;
UPDATE customers SET total_orders = 5 WHERE customer_id = 1;
UPDATE customers SET total_orders = 3 WHERE customer_id = 2;
UPDATE customers SET total_orders = 3 WHERE customer_id = 3;
UPDATE customers SET total_orders = 6 WHERE customer_id = 4;
SELECT * FROM distinct_cities;

Output:

city
Mumbai
Lucknow
Punjab

Explanation: The view returns the unique city names. Because DISTINCT is part of the view definition, every query against distinct_cities gets de-duplicated results without repeating the keyword.

Best Practices for Using SELECT DISTINCT in SQL

  1. Use DISTINCT only when it is necessary. Before adding it, check whether the query can actually return duplicates; if it cannot, DISTINCT only adds overhead.
  2. DISTINCT can remove the duplicate rows that JOINs produce over one-to-many or many-to-many relationships. If the duplicates are unexpected, check the join conditions first.
  3. An index on the columns used with DISTINCT can improve performance, because the database may be able to read the values in index order instead of sorting or hashing the rows.
  4. Using LIMIT with DISTINCT reduces the number of rows returned, and it can reduce the processing time when the database is able to stop early.
  5. If you want to use an aggregate function, prefer using GROUP BY over DISTINCT.
  6. Avoid using SELECT DISTINCT * unless necessary. Always specify the columns to optimize query performance.
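The first and third points above can be sketched as follows. The queries assume the customers table created earlier in this article, and the index name idx_customers_city is a hypothetical name chosen for illustration. Whether the index is actually used depends on the database and its query planner.

```sql
-- Assumes the customers table created earlier in this article.

-- Check whether duplicates exist before reaching for DISTINCT:
-- lists each city that occurs more than once, with its row count.
SELECT city, COUNT(*) AS occurrences
FROM customers
GROUP BY city
HAVING COUNT(*) > 1;
-- With the sample data this returns one row: Lucknow, 2.

-- Hypothetical index name. An index on city gives the database the
-- option of reading distinct values in index order.
CREATE INDEX idx_customers_city ON customers (city);
```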

What is the Difference Between GROUP BY and DISTINCT?

GROUP BY | DISTINCT
It will only group the data that matches with each other. | It will filter all the duplicates from the column.
It is used with the help of aggregate functions. | This keyword is used for getting unique values.
It will group data for further calculations. | DISTINCT removes duplicate rows by comparing values across selected columns, returning unique rows.
SELECT column1, COUNT(*) FROM table_name GROUP BY column1; | SELECT DISTINCT column1, column2 FROM table_name;
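For plain de-duplication the two approaches return the same rows; the difference shows up once a per-group calculation is needed. A short sketch, assuming the customers table created earlier in this article:

```sql
-- Assumes the customers table created earlier in this article.

-- Both queries return the same three cities.
SELECT DISTINCT city FROM customers;
SELECT city FROM customers GROUP BY city;

-- GROUP BY can also compute a value per group, which DISTINCT cannot.
SELECT city, COUNT(*) AS customers_in_city
FROM customers
GROUP BY city;
-- With the sample data: Mumbai 1, Lucknow 2, Punjab 1.
```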

Real-World Examples

Case 1: Listing the different locations where customers are based

Example:

CREATE TABLE ecommerce_customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(50),
    email VARCHAR(100),
    states VARCHAR(50)
);
INSERT INTO ecommerce_customers (customer_id, customer_name, email, states) VALUES
(1, 'Ayaan', '[email protected]', 'USA'),
(2, 'Baskar', '[email protected]', 'Canada'),
(3, 'Charith', '[email protected]', 'USA'),
(4, 'Praveen', '[email protected]', 'UK'),
(5, 'Daku', '[email protected]', 'Canada');
SELECT DISTINCT states FROM ecommerce_customers;

Output:

states
USA
Canada
UK

Explanation: Here, DISTINCT returned each country stored in the states column once, even though USA and Canada each appear for two customers.

Case 2: Listing the courses that Intellipaat is offering to students.

Example:

CREATE TABLE course_enrollments (
    student_id INT,
    course_id INT,
    course_name VARCHAR(100)
);
INSERT INTO course_enrollments (student_id, course_id, course_name) VALUES
(101, 1, 'Python Basics'),
(102, 1, 'Python Basics'),
(103, 2, 'Data Science'),
(101, 2, 'Data Science'),
(104, 3, 'Web Development');
SELECT DISTINCT course_name FROM course_enrollments;

Output:

course_name
Python Basics
Data Science
Web Development

Explanation: Here, DISTINCT returned each course name once, even though Python Basics and Data Science each have two enrollments.

Conclusion

In SQL, SELECT DISTINCT is used to remove duplicates from the query result. It makes sure that every returned row is unique across the selected columns, without changing the data stored in the table. This is very helpful when you need to filter data for analysis and reporting. DISTINCT should be used only when it is necessary, as it may reduce the query performance. In this blog, you have learned about DISTINCT, when to use it, how to use it, and its performance.


Frequently Asked Questions
Q1. What does SELECT DISTINCT do?

It returns unique (non-duplicate) rows based on the selected columns.

Q2. Does DISTINCT apply to all columns in the query?

Yes, it considers all selected columns together for uniqueness.

Q3. Is SELECT DISTINCT the same as GROUP BY?

No, DISTINCT removes duplicates, while GROUP BY is used for aggregation.

Q4. Can DISTINCT be used with ORDER BY or LIMIT?

Yes, it works fine with both to sort or limit the unique results.

Q5. Is SELECT DISTINCT * a good practice?

No, it is inefficient and you should select only the required columns to improve performance and clarity.

About the Author

Data Engineer, Tata Steel Nederland

As a skilled Data Engineer, Sahil excels in SQL, NoSQL databases, Business Intelligence, and database management. He has contributed immensely to projects at companies like Bajaj and Tata. With a strong expertise in data engineering, he has architected numerous solutions for data pipelines, analytics, and software integration, driving insights and innovation.