When you work with spreadsheets in Microsoft Excel and accidentally copy rows, or when you combine several spreadsheets into one, you will end up with duplicate rows that need to be deleted. Doing this by hand is a mindless, repetitive, time-consuming task, but there are several tricks that make it simpler.
Today we will talk about a few handy methods for identifying and deleting duplicate rows in Excel. If you don’t have any files with duplicate rows now, feel free to download our handy resource with several duplicate rows created for this tutorial. Once you have downloaded and opened the resource, or opened your own document, you are ready to proceed.
Option 1 – Remove Duplicates in Excel
If you are using Excel as part of the Microsoft Office suite, you have a bit of an advantage because there is a built-in feature for finding and deleting duplicates.
Begin by selecting the cells you want to target for your search. In this case, we will select the entire table by pressing “Control” and “A” at the same time (Ctrl + A).
Once you have successfully selected the table, you will need to click on the “Data” tab on the top of the screen and then select the “Remove Duplicates” function as shown below.
Once you have clicked on it, a small dialog box will appear. You will notice that the first row has automatically been deselected because the “My data has headers” box is ticked.
In this case, we do not have any headers, since the data itself starts in “Row 1,” so we will deselect the “My data has headers” box. Once you have done that, you will notice that the whole table is highlighted again and the “Columns” section has changed from “duplicates” to “Column A, B, and C.”
Now that the entire table is selected, just press the “OK” button to delete all duplicates. In this case, every duplicate row except one copy has been deleted, and the details of the deletion are displayed in a popup dialog box.
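To make the behavior described above concrete, here is a minimal Python sketch of the same idea: walk the rows top to bottom and keep only the first copy of each one. The function name `remove_duplicates` and the sample table are assumptions made for illustration, and the sketch compares cell values exactly, which may not match Excel's own comparison rules.

```python
def remove_duplicates(rows):
    """Return rows with later duplicates dropped, keeping the first copy."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # compare every column, like ticking Columns A, B and C
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample table with no header row
table = [
    ["duplicate", "duplicate", "duplicate"],
    ["unique", "row", "one"],
    ["duplicate", "duplicate", "duplicate"],
    ["unique", "row", "two"],
    ["duplicate", "duplicate", "duplicate"],
]

cleaned = remove_duplicates(table)
removed = len(table) - len(cleaned)
print(f"{removed} duplicate rows removed; {len(cleaned)} unique rows remain.")
```

Like the dialog box, the sketch reports how many rows were dropped and how many remain.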
Option 2 – Advanced Filtering in Excel
The second tool you can use in Excel to identify and delete duplicates is the “Advanced Filter.” This method also applies to Excel 2003. Let us start again by opening up the Excel spreadsheet. To filter your spreadsheet, you will first need to select everything using “Control” and “A” as shown earlier.
After selecting your table, simply click on the “Data” tab and, in the “Sort & Filter” section, click on the “Advanced” button as shown below. If you are using Excel 2003, click on the “Data” drop-down menu, then “Filter,” then “Advanced Filter…”
Now you will need to select the “Unique records only” check box.
Once you click on “OK,” your document should have every duplicate except one copy filtered out. In this case, two copies were left because the first duplicate was in row 1. This method automatically assumes that the first row of your table is a header, so it never compares that row against the others. If you want the extra copy in the first row removed, you will have to delete it manually. If the first row had held actual headers rather than a duplicate, only one copy of each duplicated row would have been left.
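The header assumption is easier to see in a small Python sketch. It only models the behavior described above and is not how Excel is implemented; the function name `unique_records_only` and the `has_header` flag are hypothetical.

```python
def unique_records_only(rows, has_header=True):
    """Keep the first copy of each row; a header row is never compared."""
    if has_header:
        header, body = rows[:1], rows[1:]
    else:
        header, body = [], rows
    seen = set()
    kept = list(header)  # the assumed header row is always kept as-is
    for row in body:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


dup = ["duplicate", "duplicate", "duplicate"]
table = [dup, ["unique", "row", "one"], dup, dup, ["unique", "row", "two"]]

with_header = unique_records_only(table)            # row 1 treated as a header
without_header = unique_records_only(table, False)  # row 1 treated as data

print(with_header.count(dup), without_header.count(dup))
```

When row 1 is treated as a header, one copy survives there and another survives in the body, which is why two copies are left in this example.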
Option 3 – Replace
This method is great for smaller spreadsheets if you want to identify entire rows that are duplicated. In this case, we will be using the simple “replace” function that is built into all Microsoft Office products. You will need to begin by opening the spreadsheet you want to work on.
Once it is open, select a cell containing the content you want to find and replace, then copy it. Click on the cell and press “Control” and “C” (Ctrl + C).
Once you have copied the word you want to search for, you will need to press “Control” and “H” to bring up the replace function. Once it is up, you can paste the word you copied into the “Find what:” section by pressing “Control” and “V” (Ctrl + V).
Now that you have identified what you are looking for, press the “Options>>” button. Select the “Match entire cell contents” checkbox. The reason for this is that your word may sometimes appear in other cells alongside other words. If you do not select this option, those cells would be changed as well, and you could inadvertently end up deleting rows that you need to keep. Ensure that all the other settings match those shown in the image below.
Now you will need to enter a value in the “Replace with:” box. For this example, we will use the number “1.” Once you have entered the value, press “Replace all.”
You will notice that all the values that matched “duplicate” have been changed to 1. The reason we used the number 1 is that it is small and stands out. Now you can easily identify which rows had duplicate content.
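The effect of the “Match entire cell contents” option can be sketched in a few lines of Python. The helper `replace_all` and its `match_entire_cell` parameter are hypothetical names, and the comparison here is case-sensitive, which is an assumption of the sketch rather than a statement about Excel's defaults.

```python
def replace_all(rows, find, replace_with, match_entire_cell=True):
    """Replace matching text in every cell and return a new table."""
    result = []
    for row in rows:
        new_row = []
        for cell in row:
            if match_entire_cell:
                # Only touch cells whose whole content equals the search text
                new_row.append(replace_with if cell == find else cell)
            else:
                # Also rewrites cells that merely contain the search text
                new_row.append(cell.replace(find, replace_with))
        result.append(new_row)
    return result


sheet = [["duplicate", "not a duplicate"], ["unique", "duplicate"]]

exact = replace_all(sheet, "duplicate", "1")
loose = replace_all(sheet, "duplicate", "1", match_entire_cell=False)

print(exact)
print(loose)
```

With whole-cell matching, the cell “not a duplicate” is left alone; without it, that cell is altered too, which is the risk the checkbox guards against.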
To retain one copy of the duplicates, simply paste the original text back into the first row that has been replaced by 1s.
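As a rough sketch of this step, the Python below finds every row that now consists entirely of 1s, restores the original text in the first of them, and lists the rest as candidates for deletion. The helper name `flagged_rows` and the sample data are assumptions for illustration.

```python
def flagged_rows(rows, marker="1"):
    """Return 1-based row numbers whose every cell equals the marker."""
    return [
        number
        for number, row in enumerate(rows, start=1)
        if row and all(cell == marker for cell in row)
    ]


# Hypothetical sheet after "Replace all" has run
sheet = [
    ["1", "1", "1"],
    ["unique", "row", "one"],
    ["1", "1", "1"],
    ["1", "1", "1"],
]
original = ["duplicate", "duplicate", "duplicate"]

marked = flagged_rows(sheet)
first, to_delete = marked[0], marked[1:]
sheet[first - 1] = list(original)  # paste the original text back into the first copy

print(to_delete)
```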
Now that you have identified all the rows with duplicate content, go through the document and hold down the “Control” key while clicking on the row number of each duplicate row as shown below.
Once you have selected all the rows that need to be deleted, right click on one of the grayed out row numbers and select the “Delete” option. You need to do this instead of pressing the “Delete” key on your keyboard because the menu option deletes the rows themselves, whereas the key only clears their content.
Once you are done, you will notice that all your remaining rows are unique.
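The difference between the two kinds of deletion can be illustrated with a short Python sketch. The function names `clear_contents` and `delete_rows` are hypothetical; they model clearing cell content versus removing whole rows.

```python
def clear_contents(rows, row_numbers):
    """Blank out the selected rows but leave them in place."""
    return [
        [""] * len(row) if number in row_numbers else row
        for number, row in enumerate(rows, start=1)
    ]


def delete_rows(rows, row_numbers):
    """Remove the selected rows so the rows below move up."""
    return [
        row
        for number, row in enumerate(rows, start=1)
        if number not in row_numbers
    ]


sheet = [["a", "b"], ["1", "1"], ["c", "d"], ["1", "1"]]
selected = {2, 4}  # 1-based row numbers picked with Ctrl + click

cleared = clear_contents(sheet, selected)
deleted = delete_rows(sheet, selected)

print(len(cleared), len(deleted))
```

Clearing leaves empty rows scattered through the sheet, while deleting closes the gaps.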