How to Find and Remove Duplicate Files on Linux

By Chris Hoffman on November 10th, 2014


Whether you’re using Linux on your desktop or a server, there are good tools that will scan your system for duplicate files and help you remove them to free up space. Solid graphical and command-line interfaces are both available.

Duplicate files are an unnecessary waste of disk space. After all, if you really need the same file in two different locations, you could always set up a symbolic link or hard link so the data is stored in only one location on disk.
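As a rough sketch of that idea, the shell function below replaces a duplicate with a hard link to the original, but only after confirming the two files are byte-for-byte identical. The function name link_duplicate is our own invention for illustration, not part of any tool covered here.

```shell
# Replace a duplicate with a hard link to the original, but only if the
# two files really are byte-for-byte identical.
# "link_duplicate" is a hypothetical helper name used for this sketch.
link_duplicate() {
    original=$1
    duplicate=$2
    if cmp -s -- "$original" "$duplicate"; then
        # -f replaces the existing duplicate; both names then share one inode.
        ln -f -- "$original" "$duplicate"
    else
        echo "files differ, leaving $duplicate untouched" >&2
        return 1
    fi
}
```

Hard links only work when both paths are on the same filesystem. If they aren't, a symbolic link created with ln -s is the usual alternative.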

FSlint

FSlint is available in various Linux distributions’ software repositories, including Ubuntu, Debian, Fedora, and Red Hat. Just fire up your package manager and install the “fslint” package. This utility provides a convenient graphical interface by default, but it also includes command-line versions of its various functions. Like many Linux applications, the FSlint graphical interface is just a front-end that uses the FSlint commands underneath.

Don’t let that scare you away from using FSlint’s convenient graphical interface, though. By default, it opens with the Duplicates pane selected and your home directory as the search path. All you have to do is click the Find button and FSlint will list the duplicate files in directories under your home folder. Double-click a file to preview it, and use the buttons to delete any files you want to remove.

Note that the command-line utilities aren’t in your path by default, so you can’t run them like typical commands. On Ubuntu, you’ll find them under /usr/share/fslint/fslint. So, if you wanted to run the entire fslint scan on a single directory, here are the commands you’d run on Ubuntu:

cd /usr/share/fslint/fslint

./fslint /path/to/directory

This command won’t actually delete anything. It will just print a list of duplicate files — you’re on your own for the rest.
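If you only care about duplicates, you can call FSlint's duplicate finder, findup, directly instead of running the whole suite. The wrapper below is a small sketch that assumes the Ubuntu location mentioned above; the fslint_dupes function name and the FSLINT_DIR variable are our own, and other distributions may keep the scripts somewhere else.

```shell
# Location of the FSlint command-line tools on Ubuntu (from the article).
# Override FSLINT_DIR if your distribution installs them elsewhere.
FSLINT_DIR="${FSLINT_DIR:-/usr/share/fslint/fslint}"

# Run FSlint's duplicate finder (findup) on a single directory.
# "fslint_dupes" is a hypothetical wrapper name used for this sketch.
fslint_dupes() {
    if [ -x "$FSLINT_DIR/findup" ]; then
        "$FSLINT_DIR/findup" "$1"
    else
        echo "findup not found in $FSLINT_DIR" >&2
        return 1
    fi
}

# Usage:
#   fslint_dupes /path/to/directory
```

Like the full scan, this only lists duplicates and leaves the files alone.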

fdupes

The fdupes command isn’t usually installed by default, but it’s available in many Linux distributions’ repositories. It’s a simple command-line tool, and probably the quickest, most convenient one you can use to find duplicate files in an environment where you only have access to a Linux command line, not a graphical user interface.

Using it is simple. Just run the fdupes command followed by the path to a directory. So, fdupes /home/chris would list all duplicate files in the directory /home/chris — but not in subdirectories! The fdupes -r /home/chris command would recursively search all subdirectories inside /home/chris for duplicate files and list them.

This tool won’t automatically remove anything; it will just show you a list of duplicate files. You can then delete the duplicate files by hand, if you like. You can also run the command with the -d switch to have it help you delete files. For each set of duplicates, you’ll be prompted to choose the files you want to preserve.
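Put together, a typical fdupes session might look like the sketch below. The find_dupes wrapper is a hypothetical helper that just checks fdupes is installed before running a recursive scan. The -S switch, which prints the size of each duplicate set, isn't covered above but is a standard fdupes option.

```shell
# Recursively list duplicate sets under a directory, showing file sizes.
#   -r  recurse into subdirectories
#   -S  show the size of the duplicate files
# "find_dupes" is a hypothetical wrapper name used for this sketch.
find_dupes() {
    if ! command -v fdupes >/dev/null 2>&1; then
        echo "fdupes is not installed" >&2
        return 127
    fi
    fdupes -r -S "${1:-.}"
}

# Usage (the path is the article's example):
#   find_dupes /home/chris
#
# To be prompted for which copy to keep in each set:
#   fdupes -r -d /home/chris
```

It's worth running the listing first and reading it over before you reach for -d, since deletion can't be undone.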

dupeGuru, dupeGuru Music Edition, and dupeGuru Pictures Edition

Yes, we’re going to recommend dupeGuru once again. It’s an open-source and cross-platform tool that’s so useful we’ve already recommended it for finding duplicate files on Windows and cleaning up duplicate files on a Mac.

dupeGuru is a bit less convenient because it’s not available in most Linux distributions’ software repositories — although it is available in Arch Linux’s repositories. However, the dupeGuru website offers a PPA that lets you easily install their software packages on Ubuntu and Ubuntu-based Linux distributions. Users of other Linux distributions could even compile it from source.

As on Windows and Mac, dupeGuru offers three different editions — a standard edition for basic duplicate-file-scanning, an edition designed for finding duplicate songs that may have been ripped or encoded differently, and an edition intended for finding similar photos that have been rotated, resized, or otherwise modified. You can get them all from the dupeGuru website, and all three are available in the Ubuntu PPA.

This application works just as it does on other platforms. Launch it, add one or more folders to scan, and click Scan. You’ll see a list of duplicate files, and you can check them off and remove them, or move them to another location. You can also easily open and examine a file with a double-click.

After installation, the Ubuntu package must be launched from a command line — for example, with the dupeguru_se command for the standard edition. There appears to be no desktop shortcut installed by default. This lack of system integration is the only reason we can’t recommend this utility more highly, as it works well once you get it installed and launched.
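If the missing shortcut bothers you, you can add a launcher yourself. The sketch below writes a minimal desktop entry into your per-user applications directory. The file name, the Name and Comment text, and the make_dupeguru_launcher function are all our own choices for illustration; only the dupeguru_se command comes from the package as described above.

```shell
# Write a minimal desktop entry for dupeGuru Standard Edition.
# The file name "dupeguru_se.desktop" and the Name/Comment text are
# illustrative; Exec uses the dupeguru_se command named in the article.
make_dupeguru_launcher() {
    apps_dir="${XDG_DATA_HOME:-$HOME/.local/share}/applications"
    mkdir -p "$apps_dir"
    cat > "$apps_dir/dupeguru_se.desktop" <<'EOF'
[Desktop Entry]
Type=Application
Name=dupeGuru
Comment=Find and remove duplicate files
Exec=dupeguru_se
Terminal=false
Categories=Utility;
EOF
}

# Usage:
#   make_dupeguru_launcher
```

Most desktop environments pick up new entries in that directory on their own, though you may need to log out and back in before it shows up in the menu.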


As you might expect, this isn’t a complete list. You’ll find many other duplicate-file-finding utilities — mostly commands without a graphical interface — in your Linux distribution’s package manager. Unless you have specific needs, the above tools are our favorites and the ones we recommend.

Chris Hoffman is a technology writer and all-around computer geek. He's as at home using the Linux terminal as he is digging into the Windows registry.
