It’s mostly recommended to use UNIQUE Constraints to prevent the duplicate rows. However, if you find a database with duplicate rows which is caused by human error or uncleaned data from an external source, then to find duplicate values in SQL table you need to follow these steps:
Step1:
Identify whether duplicate criteria are matching or not: First you need to decide, Are you looking for a duplicate in a single column or are you searching for a combination of two columns which can be made unique.
Step2: To verify whether the duplicates really exist or not:
Use this Query:
SELECT user_name, email, COUNT(*)
FROM users
GROUP BY user_name, email
HAVING COUNT(*) > 1
You can group both the columns.
Note: ‘HAVING’ clause is important to use because it filters the aggregate functions.
Refer to this video to learn how to use HAVING clause in detail.
Step 3: To find all rows which contain duplicates.
Use this Code:
SELECT a1.*
FROM users a1
JOIN (SELECT user_name, email, COUNT(*)
FROM users
GROUP BY user_name, email
HAVING count(*) > 1 ) b1
ON a1.user_name = b.user_name
AND a1.email = b1.email
ORDER BY a1.email
Refer to this video to learn GROUP BY syntax in detail.
If you observe here, you can see that the above query is not complicated. The first SELECT clause simply selects every column in the user's table. Inner Joins combine the duplicated data table from the first query. It’s necessary to use aliases (here, we’re using a1 & b1) to label the two versions as we’re joining the table to itself.