Converting a PDF file to an image can be done easily at the Linux command line using a single command. Discover how to install the utility, how to use it, and how to automate your setup.
What Is poppler-utils?
As alluded to in the introduction for this article, we need to install a small utility set named poppler-utils to help us convert PDF files to images.
The poppler-utils utility set includes pdftoppm, the tool we use below to convert PDF pages to images, alongside other PDF utilities such as pdftotext and pdfimages.
Installing poppler-utils
To install poppler-utils on your Debian/Apt-based Linux distribution (like Ubuntu and Mint), run:
sudo apt install poppler-utils
To install poppler-utils on your Red Hat/Yum-based Linux distribution (like RHEL and Fedora; on recent releases yum is an alias for dnf), run:
sudo yum install poppler-utils
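After installing, it can be useful to confirm that the pdftoppm binary is actually on your PATH. A minimal sketch, assuming a POSIX shell; the helper name have_tool is made up for this example:

```shell
#!/bin/sh
# Return success if the named command can be found on the PATH.
# have_tool is an illustrative helper, not part of poppler-utils.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool pdftoppm; then
    echo "pdftoppm found at: $(command -v pdftoppm)"
else
    echo "pdftoppm not found; install poppler-utils first"
fi
```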
Converting PDF to images
The command required is simple and straightforward:
pdftoppm -png test.pdf test
With the pdftoppm command we can convert PDF to images. We specify that we want a PNG file for the output format (by using -png) and that our input file is test.pdf.
We specify the output file as test. pdftoppm will automatically add a page number suffix (like -1) and an extension (based on the -png option passed earlier).
The output file name will thus be test-1.png, as we can verify next:
ls test-1.png
eog test-1.png
Any subsequent pages would be test-2.png and so on. The eog command (if eog is installed) opens the file so you can review the output, though you can use any other image viewer you like.
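When only the first page is needed, a few extra pdftoppm options help. The sketch below assumes pdftoppm's -r (resolution in DPI), -f and -l (first and last page) and -singlefile options. The function name first_page_png and the PDFTOPPM override variable are hypothetical, added here so the command can be previewed with echo before running it for real.

```shell
# Render only the first page of a PDF as a single PNG at a given resolution.
# first_page_png and PDFTOPPM are illustrative names, not part of poppler-utils.
first_page_png() {
    pdf=$1
    dpi=${2:-150}            # 150 DPI is also pdftoppm's own default
    root=${pdf%.pdf}         # strip the extension to get the output root
    # -singlefile writes root.png without a page number suffix
    "${PDFTOPPM:-pdftoppm}" -png -r "$dpi" -f 1 -l 1 -singlefile "$pdf" "$root"
}

# Preview the command instead of running it:
PDFTOPPM=echo first_page_png test.pdf 300
```

Without the PDFTOPPM=echo prefix, the same call would run pdftoppm and write test.png.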
Batch Processing of PDF Files to Images
We can write a one-liner command to batch convert all PDF files matching a given name pattern to images. We could then add this line to a small .sh script file and automate it further, or just use it at the command line whenever we need to convert a large number of PDF files to images.
ls --color=never test*.pdf | sed 's|\.pdf$||' | xargs -I{} pdftoppm -png {}.pdf {}
In this command, we first obtain a directory listing of all PDF files whose names start with test and end with .pdf, using ls --color=never test*.pdf.
The --color=never option is a safeguard: if ls is set up to emit color escape codes even when its output is piped, those codes would end up in the file names and confuse xargs.
Next we use a simple sed substitute command to replace the .pdf file extension with nothing. In other words, we remove the extension from each file name.
Stripping the extension lets us add it back only where needed, i.e. when specifying the input file for pdftoppm, but not when specifying the output file for the same pdftoppm command, much like our earlier example.
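One detail worth checking in the sed expression: in a regular expression an unescaped dot matches any character, so only \. matches a literal dot, and a trailing $ pins the match to the end of the name. A small sketch, using a made-up file name mypdfs.pdf, shows the difference:

```shell
# Unescaped dot: the first "<any character>pdf" is removed, here "ypdf".
printf '%s\n' 'mypdfs.pdf' | sed 's|.pdf||'      # prints: ms.pdf

# Escaped dot anchored with $: only the trailing extension is removed.
printf '%s\n' 'mypdfs.pdf' | sed 's|\.pdf$||'    # prints: mypdfs
```

For names like test1.pdf both forms give the same result, which is why the difference is easy to miss.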
Finally, we use xargs to send each PDF file name (minus the .pdf) to pdftoppm one by one. The -I option to xargs lets us refer to each input item received (i.e. the shortened PDF file names) by simply using {} in the command that follows.
As you can see, our pdftoppm command now looks much like the first example, with each individual PDF file name as input (re-suffixed with .pdf) and the same file name without .pdf as output.
Executing this worked fine: the three PDF files, each one page long, were converted to three individual .png files (one image per page, and here one per PDF), all named and suffixed correctly.
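File names containing quotes, backslashes, or newlines can trip up a pipeline built on ls and xargs. As an alternative sketch (a different technique from the one-liner, not a replacement the article prescribes), a plain shell loop with a glob and parameter expansion does the same job and could be saved in a small .sh file. The names convert_all and PDFTOPPM are hypothetical.

```shell
#!/bin/sh
# Convert every PDF passed as an argument to PNG images, one pdftoppm call each.
# convert_all and the PDFTOPPM override are illustrative, not part of poppler-utils.
convert_all() {
    for pdf in "$@"; do
        [ -f "$pdf" ] || continue          # skip a glob pattern that matched nothing
        # ${pdf%.pdf} strips the extension to form the output root
        "${PDFTOPPM:-pdftoppm}" -png "$pdf" "${pdf%.pdf}"
    done
}

# Dry run: print the arguments pdftoppm would receive for each test*.pdf file
PDFTOPPM=echo convert_all test*.pdf
```

Dropping the PDFTOPPM=echo prefix runs the real conversions. Because each file name is quoted, names with spaces are passed through intact.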
As an alternative to the -png option, one can also use -jpeg to generate JPEG files instead. Use pdftoppm --help or man pdftoppm to see a full list of options.
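The output format can also be made a parameter. The sketch below relies only on the -png and -jpeg options mentioned here; pdf_to_images and the PDFTOPPM override are hypothetical names, and other formats are rejected simply because this example does not cover them.

```shell
# Convert a PDF to images in the requested format (png or jpeg).
# pdf_to_images and PDFTOPPM are illustrative names, not part of poppler-utils.
pdf_to_images() {
    fmt=$1
    pdf=$2
    case $fmt in
        png|jpeg) ;;                                   # the two formats covered here
        *) echo "unsupported format: $fmt" >&2; return 1 ;;
    esac
    "${PDFTOPPM:-pdftoppm}" "-$fmt" "$pdf" "${pdf%.pdf}"
}

# Dry run: show the arguments pdftoppm would receive for JPEG output
PDFTOPPM=echo pdf_to_images jpeg test.pdf
```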
Wrapping up
In this article we saw how easy and straightforward it can be to convert PDF files to image files directly from the Linux command line. We also looked at a straightforward way to automate this process. Enjoy!