
How-To Geek

How to Banish Duplicate Photos with VisiPics


You meant well, you intended to be a good file custodian, but somewhere along the way things got out of hand and you’ve got duplicate photos galore. If you’re afraid that deleting them means losing important photos, read on as we show you how to clean up safely.

Deleting duplicate files, especially important ones like personal photos, makes a lot of people quite anxious (and rightfully so). Nobody wants to be the one to realize that they deleted all the photos of their child’s first birthday party during a hard drive purge gone wrong.

In this tutorial we’re going to show you how to go beyond the limited reach of tools which simply compare file names and file sizes. Instead we’ll be using a program that combines that kind of comparison with actual image analysis to help you weed out not just perfect 1:1 file duplicates but also those piles of resized-for-email images, cropped images, and other modified images that might be cluttering up your hard drive.
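To make the limits of the simpler tools concrete, here is a minimal Python sketch of an exact-duplicate finder that groups files by size and then by SHA-256 hash. It is our own illustration, not code from any particular tool, and the function names are made up for the example. It catches perfect 1:1 copies but will never match a resized or cropped version of the same photo, because changing a single byte changes the hash.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root that are byte-for-byte identical."""
    by_size = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have an exact duplicate
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Checking sizes first is only a shortcut: files are hashed only when at least one other file has the same length.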

What Do I Need?

For this tutorial you’ll need the following:

  • VisiPics (Windows XP or above / WINE compatible)
  • An internal or external hard drive to back up the entire collection you’ll be cleaning

We can’t emphasize the second entry in the list enough; it’s reckless to unleash any file-weeding application upon your files without a proper backup in place to restore files in case of error (user, application, or otherwise).

Backing Up Your Files and Best Practices


We just mentioned this, but it’s important enough to merit a separate entry in the guide. You must back up your files before continuing. Ideally this means copying all your image directories (no matter how cluttered or poorly organized they are) onto an external hard drive which can be disconnected from the primary machine during the image weeding process. At minimum you should copy the image directories to another hard drive within your machine and/or to another directory on the disk you’re working on.

Whatever you choose to do (or can do, based on the hardware you have on hand) you should not proceed unless there is, at minimum, a copy of every photo you’re working with in a location that will not be touched by the application we’re using.
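If you’d rather script the copy than drag folders around, a minimal Python sketch might look like the one below. The function names are our own, and the file-count comparison is only a quick sanity check, not a full verification of the copied data.

```python
import shutil
from pathlib import Path


def count_files(root: Path) -> int:
    """Count regular files anywhere under root."""
    return sum(1 for p in root.rglob("*") if p.is_file())


def back_up_pictures(source: Path, backup_root: Path) -> Path:
    """Copy source into backup_root and confirm no files were missed."""
    destination = backup_root / source.name
    # copytree refuses to write into an existing destination, which keeps
    # an earlier backup from being silently overwritten.
    shutil.copytree(source, destination)
    if count_files(source) != count_files(destination):
        raise RuntimeError(f"backup of {source} is incomplete")
    return destination
```

A call such as `back_up_pictures(Path(r"C:\Pictures"), Path(r"F:\Backup"))` would produce `F:\Backup\Pictures`; the `F:\Backup` location is a placeholder for wherever your backup drive is mounted.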

Besides making sure you’re only working on one set of files (with the other properly backed up), the other critical step is to decide which directory will be the home directory and which will be the dupe directory.

Let’s say, for example, that you have a pile of photos in C:\Pictures\ and C:\Picture Dump\. Any duplicate file finder you use will find the dupes in either directory. What you don’t want to do is to start deleting duplicates from both directories as this breaks apart the sets/collections you have.

Say both locations contain a folder called 2011 Birthday holding the same files. If you don’t pay attention to the process and delete 5 dupes from the first 2011 Birthday folder and 5 dupes from the second one, you’ll end up with a split collection that is even messier than the original pile of dupes you had on your hands.

Always check to see if there is a cluster of duplicate files and remove as many of them as you can from the duplicate directory, while leaving the home directory’s files intact. This way, when you’re done, you’ll have the least amount of work to do reincorporating the leftover files from the secondary directory into your now dupe-free and mostly clean home directory.

Before continuing, ensure your files are backed up and that you have established which directory is going to be your home directory—the place where the files will remain untouched while the duplicates elsewhere will be purged.
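The home-directory rule can be written down as a small Python sketch. The function name and the cake.jpg file are hypothetical, used only to illustrate the idea: given one group of duplicates, only copies outside the home directory are ever candidates for removal.

```python
from pathlib import PurePath, PureWindowsPath


def files_to_remove(group: list[PurePath], home: PurePath) -> list[PurePath]:
    """Pick the copies in a duplicate group that live outside home.

    Returns an empty list when no copy is in home, so the only
    surviving copies are never flagged automatically.
    """
    in_home = [p for p in group if p.is_relative_to(home)]
    outside = [p for p in group if not p.is_relative_to(home)]
    if not in_home:
        return []  # nothing to keep in home; leave the group for manual review
    return outside


# Hypothetical duplicate pair using the two example folders from above.
home = PureWindowsPath(r"C:\Pictures")
group = [
    PureWindowsPath(r"C:\Pictures\2011 Birthday\cake.jpg"),
    PureWindowsPath(r"C:\Picture Dump\2011 Birthday\cake.jpg"),
]
flagged = files_to_remove(group, home)  # only the Picture Dump copy
```

Because the comparison works on whole path components, `C:\Picture Dump` is not mistaken for a subfolder of `C:\Pictures` even though the names start the same way.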

Install and Configure VisiPics


VisiPics is a small, free, and easy-to-install app. Simply download it, run the installer, and accept the license agreement. Once the installer is done the application will launch.

To configure VisiPics you need to specify which directories you wish to scan and how strictly you wish VisiPics to compare the files. VisiPics is not a simple duplicate file-finder: it doesn’t restrict itself to comparing names, file sizes, or file hashes. VisiPics uses image analysis algorithms to compare photos and will (depending on the settings you select) even offer two photos as duplicates when they have different sizes and resolutions but are otherwise the same image.
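We don’t know exactly which algorithms VisiPics uses internally, so the sketch below swaps in a well-known stand-in, the "average hash", purely to show how two images of different resolutions can still compare as the same picture. It assumes each image has already been decoded into a grid of grayscale values (a list of rows) at least 8 pixels on each side; the function names are our own.

```python
def shrink(pixels, size=8):
    """Downscale a grayscale image (list of rows) to size x size by box averaging."""
    height, width = len(pixels), len(pixels[0])
    small = []
    for row in range(size):
        top, bottom = row * height // size, (row + 1) * height // size
        small_row = []
        for col in range(size):
            left, right = col * width // size, (col + 1) * width // size
            block = [pixels[y][x] for y in range(top, bottom) for x in range(left, right)]
            small_row.append(sum(block) / len(block))
        small.append(small_row)
    return small


def average_hash(pixels, size=8):
    """Return size*size bits: 1 where a shrunken pixel is brighter than the mean."""
    flat = [value for row in shrink(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the bit positions where two hashes disagree."""
    return sum(a != b for a, b in zip(hash_a, hash_b))
```

Two images whose hashes differ in only a few bits are likely the same picture at different sizes or compression levels. How many differing bits to tolerate is a tuning decision, much like the strictness setting described above.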

First, let’s pick our directories. For the purpose of this demonstration we’ll be selecting two directories that we know have duplicate files in them. In our My Documents folder we have a folder called \Picture Dump\. We took this folder and copied the images to the E:\ drive to create our duplicate set. By clicking on File –> Add Folder (or by using the folder browser pane and the Add Arrow button) we can easily add the two folders to VisiPics.


Now would be a good time to mention that VisiPics has a Project function which allows you to save all your settings in between sessions. If you’ve spent a bit of time selecting folders (or later, tweaking settings), you’ll definitely want to take a moment to go to File –> Save Project and store the resulting VSP project file in a place where it won’t get accidentally deleted.

Once you have your folders selected, you can then move the folders up or down in the list in order to create prioritization for the auto-select tool. Your home directory should be the directory at the top—use the up and down arrows at the right side of the folder list to change the position of the folders. You can see the rules for Auto-Select by clicking on the Auto-Select tab. The default is to select uncompressed files, lower resolution files, and smaller files, first. You can uncheck any of these options to alter the behavior of the duplicate finder. Note: Auto-Select will never actually automatically select files unless you click the Auto-Select button.
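To illustrate how folder priority and quality rules can combine, here is a hedged Python sketch of an auto-select pass over one group of duplicates. It is not VisiPics’ actual logic: the `Candidate` record, the scoring order (resolution first, then file size, then folder position as the tie-breaker), and the file names are all assumptions made for the example, and the uncompressed-file rule is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    path: str
    folder_rank: int  # 0 = top of the folder list (the home directory)
    width: int
    height: int
    size_bytes: int


def auto_select(group: list[Candidate]) -> list[Candidate]:
    """Return every candidate except the one judged best to keep."""
    def keep_score(c: Candidate):
        # Bigger tuple wins: more pixels, then more bytes, then a
        # folder closer to the top of the list.
        return (c.width * c.height, c.size_bytes, -c.folder_rank)

    keeper = max(group, key=keep_score)
    return [c for c in group if c is not keeper]


# Hypothetical group: one original, one exact copy, one emailed resize.
original = Candidate(r"C:\Pictures\cake.jpg", 0, 4000, 3000, 5_000_000)
exact_copy = Candidate(r"E:\Picture Dump\cake.jpg", 1, 4000, 3000, 5_000_000)
resized = Candidate(r"E:\Picture Dump\cake_email.jpg", 1, 800, 600, 200_000)
flagged = auto_select([resized, exact_copy, original])
```

With this ordering the home directory copy survives whenever it is at least as good as the others, which is the outcome you want when the home directory sits at the top of the list.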

Once you have the directories picked out and prioritized, you can run your initial test run. No files will be deleted; this test run simply lets you see whether you need to adjust your filter settings for better results. Go ahead and press the green play arrow in the middle of the interface panel to begin the process. Depending on how many files you have, this may take anywhere from a few minutes to an hour or more with large 20,000+ file collections.


In the case of our test run, we have two directories: one on the C drive and one on the E drive. We purposely altered some of the files on the E drive (reduced the file size, altered the dimensions, and so on) to double check VisiPics’ search algorithms. VisiPics found all the duplicate files, including the files with different sizes, resolutions, and file names.


More importantly, when we used the Auto-Select button, it accurately picked out the duplicate files from the non-prioritized directory first while still respecting the Auto-Select rules that instructed it to also flag the lower-quality files for deletion.


Now that you have your files scanned, and you’ve hit Auto-Select to see the files that are VisiPics’ best choices, you have several options. You can bulk delete or move the files all at once by clicking the Move and Delete buttons in the Actions section located on the right hand side of the interface. We’d recommend, however, not firing off the Delete button until you’ve taken a moment to look over the results and confirm that the files are the ones you want deleted.

Move allows you to take all the duplicate files and move them somewhere new, essentially creating a backup of the dupes. If you’re pretty sure VisiPics has selected the best files but you want to err on the side of caution, move the files to a secondary directory or drive.
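The same move-instead-of-delete idea is easy to reproduce outside the application. The Python sketch below is our own illustration, not a description of how VisiPics’ Move button behaves: it relocates a list of flagged files into a holding folder while keeping each file’s subfolder path, so anything moved by mistake can be put back where it came from. The function and parameter names are hypothetical.

```python
import shutil
from pathlib import Path


def move_to_holding(files: list[Path], source_root: Path, holding_root: Path) -> list[Path]:
    """Move files into holding_root, keeping their path relative to source_root."""
    moved = []
    for path in files:
        target = holding_root / path.relative_to(source_root)
        if target.exists():
            raise FileExistsError(f"{target} already exists; refusing to overwrite")
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

Once you’re satisfied nothing important ended up in the holding folder, you can delete it in one step.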

Finally, the safest way to use VisiPics is to go down the list and check each file by hand. While this is the surest way to ensure there are no accidental deletions, on a large collection it is very time consuming. If you’re trying to sort out a mess of 15,000 duplicate photos we’d recommend using the Move function to back them up (or relying on the original backup you created earlier in the tutorial) and simply checking the first few hundred images to ensure VisiPics has sorted them according to your settings. After the initial check, let the application handle deleting the dupes.

If you do opt to hand-check the entire list of files, we’d strongly suggest taking advantage of the previously mentioned Save Project function so that you can save the entire process at any point and return to it later without having to rescan or reflag your photos.

Regardless of how much hand-checking or automation you use, when you’re done you’ll have a tidied directory with the highest quality versions of your images—without a duplicate in sight.


Have a tip, trick, or tool for ferreting out duplicate files? Share your knowledge in the comments below.

Jason Fitzpatrick is a warranty-voiding DIYer and all around geek. When he's not documenting mods and hacks he's doing his best to make sure a generation of college students graduate knowing they should put their pants on one leg at a time and go on to greatness, just like Bruce Dickinson.

  • Published 06/26/12

Comments (16)

  1. tony

    Every time I start this process, I stop because I just know I’ll end up deleting something I should not have deleted. And as much as I don’t have confidence in automatically deleting files, I also don’t have the patience to look through them all by hand. I’ll take a deep breath and give it another go. What’s the worst that could happen?

  2. David Aris-Sutton

    @tony With the relatively low cost of hard drive space, would it not make sense just to keep the duplicates? It may be a little untidy, but so what? It’s better than losing a picture you treasure.

  3. Dave

    When you allow the program to move the files, does it preserve the folder structure?

    So if you have

    Pictures > Parties > Xmas etc, will it move the file to the same folder on the destination?

  4. TheFu

    After the fact, it is really hard to come back and organize photos. Starting off organized is much easier.

    ~/Pictures/{YYYY}/{MM}-{Event_name}/{number}-{photo_description}.jpg

    I have tens of thousands of photos organized this way. The {number} is simply to keep the photos in order. This helps relive the entire experience later. Eventually, the built-in file time stamp will be incorrect, so you can’t trust that. EXIF data is better, but most people never keep it corrected.

    For vacation photos, consider replacing {MM} to {MMDD} to keep days in order.

    Programs come and go, so I’ve never been able to trust them to handle the organization. I trust file systems much more.
    Backup, backup, backup.

  5. Keith6286

    This sounds like the program I have been looking for. On the wiki it only mentions 32bit support; is there a program as good as VisiPics that will work in Win 7 64bit?

  6. J.R.Frye

    I do not now have the extra hard drive to accompany this program @ this time. How long will I have access to these programs to unduplicate my computer. I have some pics and music that have been saved from the old DOS days, and frankly it scares me a little to try to partition a new drive onto my main.

  7. MikeMoss

    Just go out and buy an external hard drive.

    No one should have anything they don’t want to lose in only one location or even on one hard drive in two different partitions.

    I can tell you that hard drives do fail, I had a friend who lost his hard drive a few years ago taking everything he had saved for years with it.

    He never backed anything up to an external drive.

    They are pretty cheap any more and well worth the cost.

  8. robertm

    one day…. i was typing along…… and my computer crashed
    i restarted
    and all i heard was click click click click

    my harddrive died. I still miss the data that was contained on it.

  9. sandy

    To recover almost anything – go to Christophe Grenier’s PHOTOREC software – cgsecurity.com

  10. Maglor

    Awesome Duplicate Photo Finder is also a good alternative.
    Free and nice big pictures to compare them.

    http://www.duplicate-finder.com/photo.html

  11. Phylis Sophical

    I can highly recommend this program. Been using it for years. Very easy to use. Deletes are sent to Recycle bin so if you goof, you can still recover them. Shows exactly what folder each duplicate is in. Doesn’t delete folders, only files. Really good (simple) help file.

  13. Noel

    I went through this process a few days ago and searched up and down the cyberworld to find the right piece of software. Many places said VisiPics is best. Yes, it’s good, but it’s super slow, memory hogging, and gives you far fewer results than there actually are. I found switching between the duplicates, i.e. finding which one is in which directory (so that I delete the right one), was very difficult (and slow). Overall, I gave thumbs down to this program.

    Instead, give a try to Awesome Duplicate Photo Finder from http://www.duplicate-finder.com First of all, this author needs to give some different name to this application as it sounds like another crappy software or bloatware or even a trojan. I took a chance and used it, trust me, its super fast, gives you way more results than VisiPics and very easy to see the file path with size.

    To search my 60GB photo library, (mostly 7 and 10MP photos), Visipics took 3 full days, whereas Awesome Photo Finder took about 12 hours or so. Give a try, I am sure a user will love it.

  14. catester

    I am so glad to see your recommendation to backup all the pictures before starting this process. I can’t tell you how many times I’ve had to go rescue customers’ important photos because when they were asked “Are you sure?” they answered “Yes.” It is entirely possible to be completely sure…but also wrong. Thanks for putting the backup recommendation above the fold!

  15. Pipo

    Nice tutorial, I’ve tried other duplicate cleaners before, but as Jason said it only compares file names, file sizes, or file hashes. Definitely going to be adding this on my to-do list this coming weekend.

    Any chance you’ll be doing something like this for music files next?

  16. Ray

    I’ve used it before, it was wonderful. I also want to thank those guys for stressing back-ups before action. I am still amazed when friends call me to help them out and I can’t because they never backed anything up. Jeez Louise, drives are so cheap these days, and even if you think you can’t afford a 40 dollar hard drive think about the cost when you lose all those precious photos.

    Great comments.
