Detect and remove duplicates in SQL Server

When working with large volumes of data, it happens to have duplicates at the tables level in the database.
Therefore. it could lead to exceptions in the code and incorrect statistics.
In the following lines we show you how to detect duplicates and how to remove them.

duplicates

How to manage duplicates problem in SQL Server

 

Example:

ID          LastName         FirstName
----------- ---------------- --------
33          SMITH            John
34          SMITH            John
35          SMITH            John

A) Execute duplicates search query on a given table

SELECT   COUNT(*) AS NumberOfDuplicates, field1, field2, field3
 FROM     [TableName]
 GROUP BY field1, field2, field3
 HAVING   COUNT(*) > 1

B) Remove duplicates

We shall proceed by the difference of the ID field values.
In our example we will remove entries with ID 34 and 35.

-- Solution 1
 DELETE table FROM table
 LEFT OUTER JOIN (
 SELECT MIN(id) as id, field1, field2, field3
 FROM table
 GROUP BY field1, field2, field3
 ) as t1
 ON  table.id = t1.id
 WHERE t1.id IS NULL

Or

-- Solution 2
 DELETE t1 
 FROM table AS t1, table AS t2
 WHERE t1.id > t2.id
 AND t1.column1 = t2.column1
 AND t1.column2 = t2.column2
 AND t1.column3 = t2.column3

Print Friendly, PDF & Email

Leave a Reply