SEARCH

The How-To Geek Forums Have Migrated to Discourse

How-To Geek Forums / Microsoft Office

(Solved) - Find duplicate cells in Excel

(23 posts)
  • Started 1 year ago by 6card
  • Latest reply from moreeg
  • Topic Viewed 1749 times

6card
6card
Posts: 357

How can I detect and replace duplicate cells in a column? Perhaps a macro?

Would like to replace every dup with the same data entry such as "xxx", for example.

Thx, sixcard

Posted 1 year ago
Top
 
moreeg
moreeg
Posts: 842

Hi 6card

If your column containing duplicate values is sorted so that the duplicates are contiguous you can try this solution found by Enthusiast. If they aren't sorted or cannot be sorted let us know and we can see if we can build a macro solution.

Posted 1 year ago
Top
 
moreeg
moreeg
Posts: 842

Just because it's a slow Monday (Public Holiday here) here is a quick Macro to replace duplicates with XXXXX. It prompts for a column number.

Sub DelDup()

Dim lNum As Integer

On Error Resume Next

Application.DisplayAlerts = False

lNum = Application.InputBox _
(Prompt:="Please enter the Column you want to Dedupe e.g Column A = 1, B = 2 etc.", _
Title:="Column to Dedupe", Type:=1)

On Error GoTo 0

Application.DisplayAlerts = True

' Find the number of rows populated

LastRow = Cells(1, lNum).End(xlDown).Address

NoRows = Range(Cells(1, lNum).Address, LastRow).Count

For c = 1 To NoRows
For i = 1 To NoRows

If i = c Then GoTo 111

If Cells(i, lNum) = "XXXXX" Then GoTo 111

If Cells(i, lNum) = Cells(c, lNum) Then
Cells(i, lNum) = "XXXXX"
End If

111 Next i
Next c

End Sub

Posted 1 year ago
Top
 
6card
6card
Posts: 357

@moreeg

"Contiguous" was an early hiccup, I didn't catch that comment in your first post.

Now works very well!

Many thanks, sixcard

Posted 1 year ago
Top
 
Enthusiast
Enthusiast
Posts: 566

moreeg,

I have modified your macro slightly to allow the user to click on a cell in the column to remove duplicates from. It occured to me that they user, if working in column "GF1", would not have a clue what the column number is to input it. I also modified your loop structure to remove the GOTO, preferring to use nested IF structures for readability. See what you think.

Sub DelDup()
    Dim lNum As Integer
    Dim rngInput As Range
    On Error Resume Next
    Application.DisplayAlerts = False

    Set rngInput = Application.InputBox(Prompt:= _
        "Please click in the column to remove duplicates.", _
        Title:="SPECIFY COLUMN", Type:=8)

    lNum = rngInput.Column

    On Error GoTo 0
    Application.DisplayAlerts = True
    ' Find the number of rows populated
    LastRow = Cells(1, lNum).End(xlDown).Address
    NoRows = Range(Cells(1, lNum).Address, LastRow).Count
    For c = 1 To NoRows
        For i = 1 To NoRows
            If i <> c Then
                If Cells(i, lNum) <> "XXXXX" Then
                    If Cells(i, lNum) = Cells(c, lNum) Then
                        Cells(i, lNum) = "XXXXX"
                    End If
                End If
            End If
        Next i
    Next c
End Sub
Posted 1 year ago
Top
 
6card
6card
Posts: 357

@enthusiast Yes that mod is very useful on larger spreadsheets. Actually mine have 208 fields but I moved the dup ones (only 2) closer so I could just count them.

six

Posted 1 year ago
Top
 
moreeg
moreeg
Posts: 842

Hi Enthusiast

"if working in column "GF1"," ... that's why we got fingers and toes :) but I take your point - being able to click is much easier.

Your nested ifs are absolutely much more elegant (add bold. 16" Font, bright red for emphasis) and almost assuredly more efficient than my code but "easier to read"???. I suppose for programmers that is true but not, unfortunately, for me which is why I ended up with a bunch of random if statement. So thanks for that - hopefully I'll be able to produce code similar to yours next time.

In the end though, this macro is very dangerous stuff because it overwrites your data. If anything should go wrong the potential of losing data is very high. I would opt for the conditional formatting solution or if a macro is absolutely necessary then creating an empty adjacent column to mark a duplicate.

But most amazing of all that I'd like to know is ..... how did you preserve the indent in the post????

Posted 1 year ago
Top
 
6card
6card
Posts: 357

Hi Guys,

After running the macro, where I found the XXXXX, I would then sort and delete them. This did indeed loose some data. Is it possible to modify the data, e.g., change abcd not to xxxxx but rather to say, xxxxx_original data, i.e., xxxxx_abcd?

six

Posted 1 year ago
Top
 
moreeg
moreeg
Posts: 842

try this ...

Sub DelDup()
Dim lNum As Integer
Dim rngInput As Range
On Error Resume Next
Application.DisplayAlerts = False

Set rngInput = Application.InputBox(Prompt:= _
"Please click in the column to remove duplicates.", _
Title:="SPECIFY COLUMN", Type:=8)

lNum = rngInput.Column

On Error GoTo 0
Application.DisplayAlerts = True
' Find the number of rows populated
LastRow = Cells(1, lNum).End(xlDown).Address
NoRows = Range(Cells(1, lNum).Address, LastRow).Count
For c = 1 To NoRows
For i = 1 To NoRows
If i <> c Then
If Cells(i, lNum) <> "XXXXX" Then
If Cells(i, lNum) = Cells(c, lNum) Then
Cells(i, lNum) = "XXXXX_" & Cells(i, lNum)
End If
End If
End If
Next i
Next c
End Sub

Posted 1 year ago
Top
 
Enthusiast
Enthusiast
Posts: 566

moreeg,

I think I will start my responses with a disclaimer, "Always test others suggestions on a COPY of your spreadsheet. Keep your original safe."

As for the way to keep from having to use GOTO, consider this code:

.
.
if numRow = 13 Then GOTO 1111
    TotalSpent = TotalSpent + value(numRow)
1111 End If
.
.

In that code you are testing for a condition in which you DON'T want to add to the TotalSpent. However, if you invert that test to be a test for only what you DO want to add to TotalSpent, we can clean up the code, as in:

.
.
if numRow <> 13 Then
    TotalSpent = TotalSpent + value(numRow)
End If
.
.

We accomplish the same goal, however, by inverting any test that would result in requiring a GOTO, we can keep the code cleaner and more readable.

Posted 1 year ago
Top
 
nosparks
Posts: 148

@Enthusiast - moreeg's other question is also of interest to me:

But most amazing of all that I'd like to know is ..... how did you preserve the indent in the post????

Posted 1 year ago
Top
 
Enthusiast
Enthusiast
Posts: 566

@all,

That took me a long time to figure out. You bracket your code with "< code >" and "</ code >" (no quotes or spaces inside the quotes). That much we know. The important thing to do is after you "Send Post", you turn around and edit your post. Then simply "Send Post" again. I figured this out by attempting to fix a typo right after I submitted a post. After the Edit, I sent the post again and noticed the indents were where they should be.

Weird workaround, but as you see, it works. Have at it guys!

Posted 1 year ago
Top
 
nosparks
Posts: 148

WOW!
Would have never figured that out.
You should have left out how you stumbled upon this and remained a "Brilliant Genius" instead of lowering yourself to "Ordinary Genius".

Thanks alot, really appreciate that
NoSparks

Posted 1 year ago
Top
 
Enthusiast
Enthusiast
Posts: 566

LOL. Teach a man to fish...

Posted 1 year ago
Top
 
Xhi
Xhi
Posts: 6298

LOL. Teach a man to fish...
...and he'll probably get mercury poisoning and wind up on Medicaid.

Posted 1 year ago
Top
 
Enthusiast
Enthusiast
Posts: 566

Here is the non-destructive version, which simply highlights the duplicates:

Sub DelDup()
    Dim lNum As Integer
    Dim rngInput As Range
    On Error Resume Next
    Application.DisplayAlerts = False

    Set rngInput = Application.InputBox(Prompt:= _
        "Please click in the column to remove duplicates.", _
        Title:="SPECIFY COLUMN", Type:=8)

    lNum = rngInput.Column

    On Error GoTo 0
    Application.DisplayAlerts = True
    ' Find the number of rows populated
    LastRow = Cells(1, lNum).End(xlDown).Address
    NoRows = Range(Cells(1, lNum).Address, LastRow).Count
    For c = 1 To NoRows
        For i = 1 To NoRows
            If i <> c Then
                If Cells(i, lNum) <> "XXXXX" Then
                    If Cells(i, lNum) = Cells(c, lNum) Then
                        Cells(i, lNum).Select
                        With Selection.Interior
                            .Pattern = xlSolid
                            .PatternColorIndex = xlAutomatic
                            .Color = 65535
                            .TintAndShade = 0
                            .PatternTintAndShade = 0
                        End With
                    End If
                End If
            End If
        Next i
    Next c
End Sub
Posted 1 year ago
Top
 
moreeg
moreeg
Posts: 842
@Enthusiast -
             thanks for the tips. I'll refrain from making any philosophical connections
             about my "negative" approach to code
             versus your "positive" one but I must admit that the positive
             approach is intuitively cleaner and
             easier to understand.  

@6card
             If you want to delete the duplicates the easiest way
             is to use the "Remove Duplicates" in the Data Ribbon.
             It's dead simple to use and has the advantage of allowing
             you to "undo" in case the results are not
             as expected.

@Enthusiast
            Yep ----- <Code > works a treat

            what other HTML tags can we use? (I've been trying < b >< /b> but with no success)
Posted 1 year ago
Top
 
Enthusiast
Enthusiast
Posts: 566

@moreeg,

That is the only one I have found that works so far. I believe I originally tried it because it is a BB code.

"[ u ]" and "[ /u ]" work to underline

"[ quote ]" and "[ /quote ]" allow you to do quotations or notes.

I seem to remember an issue where posters were putting large fonts and different colors on their posts and the mods asked them to stop. The codes may have been inactivated at that time.

Posted 1 year ago
Top
 
moreeg
moreeg
Posts: 842

@Enthusiast
Your highlighting code works slightly differently from the original in that the original will only mark the true duplicates and leave the first occurrence unmarked. Either way it would be the user's preference how they want it but here is the code that will highlight only the duplicates and not the original ...

Sub DelDup2()
    Dim lNum As Integer
    Dim rngInput As Range
    On Error Resume Next
    Application.DisplayAlerts = False

    Set rngInput = Application.InputBox(Prompt:= _
        "Please click in the column to remove duplicates.", _
        Title:="SPECIFY COLUMN", Type:=8)

    lNum = rngInput.Column

    On Error GoTo 0
    Application.DisplayAlerts = True
    ' Find the number of rows populated
    LastRow = Cells(1, lNum).End(xlDown).Address
    NoRows = Range(Cells(1, lNum).Address, LastRow).Count
    For c = 1 To NoRows
        For i = 1 To NoRows
            If i <> c Then
                If Cells(c, lNum).Interior.Color <> 65535 Then
                    If Cells(i, lNum) = Cells(c, lNum) Then
                        Cells(i, lNum).Interior.Color = 65535
                    End If
                End If
            End If
        Next i
    Next c
End Sub
Posted 1 year ago
Top
 
Enthusiast
Enthusiast
Posts: 566

Good catch, I didn't replace the check for "XXXXX" with a check for the interior color change.

Posted 1 year ago
Top
 



Topic Closed

This topic has been closed to new replies.