Delete Duplicate Records in SQL Server (Simple Guide)
🧹 How to Delete Duplicate Rows in SQL Server (Step-by-Step for Beginners)
This simple guide shows you how to remove duplicate rows from a SQL Server table using a CTE (Common Table Expression).
🎯 Step 1: Understand your table
Imagine you have a table named Employees that looks like this:
| EmployeeID | FirstName | LastName | |
|---|---|---|---|
| 1 | John | Doe | john@example.com |
| 2 | John | Doe | john@example.com |
| 3 | Jane | Smith | jane@example.com |
| 4 | Jane | Smith | jane@example.com |
Here, John Doe and Jane Smith each appear twice. We want to keep only one copy of each.
⚙️ Step 2: Find duplicates
We’ll use a helper query called a CTE to find duplicate rows:
WITH CTE AS (
SELECT
EmployeeID,
ROW_NUMBER() OVER (
PARTITION BY FirstName, LastName, Email
ORDER BY EmployeeID
) AS rn
FROM Employees
)
SELECT * FROM CTE;
This does the following:
- Groups rows that have the same FirstName, LastName, and Email.
- Gives each one a number (
rn= row number). - The first one gets
rn = 1(we’ll keep it), and others getrn = 2,3, etc. (we’ll delete them).
| EmployeeID | rn |
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 1 |
| 4 | 2 |
🧹 Step 3: Delete the duplicates
WITH CTE AS (
SELECT
EmployeeID,
ROW_NUMBER() OVER (
PARTITION BY FirstName, LastName, Email
ORDER BY EmployeeID
) AS rn
FROM Employees
)
DELETE FROM CTE WHERE rn > 1;
This deletes all rows where rn > 1, meaning all the duplicate copies.
✅ Step 4: Check the result
SELECT * FROM Employees;
You should now see only one of each record:
| EmployeeID | FirstName | LastName | |
|---|---|---|---|
| 1 | John | Doe | john@example.com |
| 3 | Jane | Smith | jane@example.com |
💡 Notes:
- Always make a backup before deleting any data.
- If you just want to see duplicates first, replace
DELETEwithSELECT. - CTE (Common Table Expression) is just a temporary helper used to organize your SQL.
Comments
Post a Comment