Clean & Look Up · Core · 25 min read

Finding & Removing Duplicates

Duplicate rows quietly inflate your totals — Excel can spot them and remove them in a couple of clicks.

What you will learn

  • Understand why duplicates are a problem
  • Highlight duplicates to see them first
  • Remove duplicate rows safely

Why duplicates hurt

A duplicate is the same record entered more than once. They sneak in when data is copied, merged from two files, or typed twice. The danger is that they inflate your numbers: if an order appears twice, your sales total counts it twice and your report is wrong.

Picture three orders of 100, 200 and 100 — but the second 100 is an accidental duplicate of the first. SUM does not know it is a mistake; it faithfully adds every row:

Totalling sales of 100, 200 and a duplicated 100
=SUM(B2:B4)

Note: Output: 400 (wrong, it should be 300). The duplicate 100 was counted, pushing the total to 400 when the true figure is 300. This is exactly why you remove duplicates before you trust any total: the formula is correct, but the data was not.

Step 1: see them before you delete

It is wise to look at the duplicates before removing anything. Select the column you want to check, then go to Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values. Excel colours every cell whose value appears more than once in the selection, so you can see the repeats at a glance.

Imagine this customer list — notice Asha appears twice:

   | A
1  | Customer
2  | Asha
3  | Ben
4  | Asha
5  | Carl

Note: Output: After applying Highlight Duplicate Values, both Asha cells turn a colour (often light red). Nothing is deleted yet — you are just seeing the problem clearly first.
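
If you would rather have a TRUE/FALSE flag than a colour, a helper column can run the same check. This is only a sketch: it assumes the list above sits in A2:A5 and that column B is free, so adjust the ranges to your own sheet.

Flagging repeated customers in a helper column (type in B2, then fill down to B5)
=COUNTIF($A$2:$A$5, A2)>1

Note: Output: TRUE for both Asha rows, FALSE for Ben and Carl. COUNTIF counts how often each name appears in the list, and any count above 1 means a repeat. Like the highlight, this deletes nothing.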

Step 2: remove them

When you are ready, click any cell in the table and go to Data > Remove Duplicates. Tick the column (or columns) that decide what counts as a duplicate, then press OK.

The menu path to remove repeated rows based on the Customer column
Data  >  Remove Duplicates  >  tick "Customer"  >  OK

Note: Output: 1 duplicate value found and removed; 3 unique values remain. Excel kept the first Asha, deleted the second, and left Ben and Carl untouched. Your list is now clean and your totals will be correct.
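
If you want a clean list without deleting anything, the UNIQUE function is an alternative to the Remove Duplicates button, not part of it. It is only available in Excel for Microsoft 365 and Excel 2021 or later. This sketch assumes the names are in A2:A5 and that the cells from C2 downwards are empty.

Listing each customer once while leaving the original column untouched (type in C2)
=UNIQUE(A2:A5)

Note: Output: Asha, Ben and Carl spill into C2:C4. The original four rows stay where they are, so nothing is lost if the result is not what you expected.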

The whole flow, step by step

Removing duplicates is a delete action, so do it in a safe order every time. Here is the full flow from start to finish:

  1. Back up first. Right-click the sheet tab, choose Move or Copy and tick Create a copy (Excel for the web shows a Duplicate option instead), or copy the data to a new sheet, so you can return to the original if something goes wrong.
  2. Highlight the duplicates with Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values, and eyeball them so you know what is about to go.
  3. Click one cell inside the table (do not select a single column), so Excel works on the whole table together.
  4. Open the tool: Data > Remove Duplicates.
  5. Tick the right columns — only the columns that decide what counts as a duplicate (for example Customer and Order ID together), then press OK.
  6. Read the message Excel shows (for example “1 duplicate removed”) and check the remaining rows look correct.
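
One way to double-check the message in step 6 is to compare row counts between the backup and the cleaned sheet. This sketch assumes the cleaned sheet is named Orders, the backup from step 1 is named Orders (2), and column A has no blank cells. Both sheet names are placeholders, so swap in your own.

Counting how many rows were removed by comparing the backup with the cleaned sheet
=COUNTA('Orders (2)'!A:A)-COUNTA(Orders!A:A)

The result should match the number of duplicates Excel reported. If it does not, something else changed between the two sheets and is worth a closer look before you carry on.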

Watch out: Remove Duplicates deletes rows from your sheet. Undo (Ctrl+Z) can bring them back only if you notice straight away, so always keep a copy of the original data first (duplicate the sheet) in case you remove too much.

Tip: Be careful which columns you tick. If you only tick Customer, two different orders from the same customer look like duplicates. Tick enough columns (like Customer AND Order ID) so that only truly identical rows are removed.
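
To preview which rows Remove Duplicates would delete when two columns are ticked, a helper column with COUNTIFS can mark them first. This sketch assumes Customer is in column A and Order ID is in column B, with data starting in row 2 and column C free. That layout is an illustration, not something this lesson prescribes.

Marking second and later copies of the same Customer and Order ID (type in C2, then fill down)
=COUNTIFS($A$2:A2, A2, $B$2:B2, B2)>1

Note: Output: FALSE the first time a Customer and Order ID pair appears, TRUE for every later copy. The range grows as the formula is filled down, so each row is compared only with itself and the rows above it. Two different orders from the same customer stay FALSE because their Order IDs differ.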

Q. Before clicking Remove Duplicates, what is the safest first step?

Answer: Make a backup copy of the data. Remove Duplicates deletes rows permanently, so a backup means you can recover if it removes more than you intended.

✍️ Practice

  1. Make a list of names with two obvious duplicates and highlight them with Conditional Formatting.
  2. Copy the sheet as a backup, then use Remove Duplicates and confirm only the repeats were deleted.

🏠 Homework

  1. Build a list of 10 orders where 2 are exact duplicates. Highlight them, total the sales before and after removing duplicates, and note how the total changed.