Back

Explore Courses Blog Tutorials Interview Questions
0 votes
4 views
in SQL by (20.3k points)

How can I delete duplicate rows where no unique row id exists?

My table is

col1  col2 col3 col4 col5 col6 col7

john  1    1    1    1    1    1 

john  1    1    1    1    1    1

sally 2    2    2    2    2    2

sally 2    2    2    2    2    2

I want to be left with the following after the duplicate removal:

john  1    1    1    1    1    1

sally 2    2    2    2    2    2

I've tried a few queries but I think they depend on having a row id as I don't get the desired result. For example:

DELETE

FROM table

WHERE col1 IN (

    SELECT id

    FROM table

    GROUP BY id

    HAVING (COUNT(col1) > 1)

)

1 Answer

0 votes
by (40.7k points)

Use, CTEs, and ROW_NUMBER as the combination of these will allow you to see which rows are deleted (or updated), Thus change the DELETE FROM CTE... to SELECT * FROM CTE as follows:

QUERY:

WITH CTE AS(

 SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7],

       RN = ROW_NUMBER()OVER(PARTITION BY col1 ORDER BY col1)

   FROM dbo.Table1

)

DELETE FROM CTE WHERE RN > 1

For more information, you can refer to DEMO

COL1    COL2    COL3    COL4    COL5    COL6    COL7

john           1             1             1            1             1       1

sally           2          2             2            2            2       2

This example results in duplicates by a single column col1 because of the PARTITION BY col1.

If you have to include multiple columns, then simply add them to the PARTITION BY:

ROW_NUMBER () OVER(PARTITION BY Column1, Column2, ... ORDER BY OrderColumn)

Related questions

Browse Categories

...