Back

Explore Courses Blog Tutorials Interview Questions
0 votes
4 views
in SQL by (20.3k points)

How can I delete duplicate rows where no unique row id exists?

My table is

col1  col2 col3 col4 col5 col6 col7

john  1    1    1    1    1    1 

john  1    1    1    1    1    1

sally 2    2    2    2    2    2

sally 2    2    2    2    2    2

I want to be left with the following after the duplicate removal:

john  1    1    1    1    1    1

sally 2    2    2    2    2    2

I've tried a few queries but I think they depend on having a row id as I don't get the desired result. For example:

DELETE

FROM table

WHERE col1 IN (

    SELECT id

    FROM table

    GROUP BY id

    HAVING (COUNT(col1) > 1)

)

1 Answer

0 votes
by (40.7k points)

Use, CTEs, and ROW_NUMBER as the combination of these will allow you to see which rows are deleted (or updated), Thus change the DELETE FROM CTE... to SELECT * FROM CTE as follows:

QUERY:

WITH CTE AS(

 SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7],

       RN = ROW_NUMBER()OVER(PARTITION BY col1 ORDER BY col1)

   FROM dbo.Table1

)

DELETE FROM CTE WHERE RN > 1

For more information, you can refer to DEMO

COL1    COL2    COL3    COL4    COL5    COL6    COL7

john           1             1             1            1             1       1

sally           2          2             2            2            2       2

This example results in duplicates by a single column col1 because of the PARTITION BY col1.

If you have to include multiple columns, then simply add them to the PARTITION BY:

ROW_NUMBER () OVER(PARTITION BY Column1, Column2, ... ORDER BY OrderColumn)

Related questions

Welcome to Intellipaat Community. Get your technical queries answered by top developers!

30.5k questions

32.6k answers

500 comments

108k users

Browse Categories

...