Want to see the text inside a binary or data file? The Linux
strings command pulls those bits of text—called “strings”—out for you.
Linux is full of commands that can look like solutions in search of problems. The
strings command definitely falls into that camp. Just what is its purpose? Is there a point to a command that lists the printable strings from within a binary file?
Let’s take a step backward. Binary files—such as program files—may contain strings of human-readable text. But how do you get to see them? If you use
less you are likely to end up with a hung terminal window. Programs that are designed to work with text files don’t cope well if non-printable characters are fed through them.
Most of the bytes within a binary file are not human readable and cannot be printed to the terminal window in a way that makes any sense. Only byte values that correspond to alphanumeric characters, punctuation, or whitespace have characters or standard symbols to represent them. Collectively, these are known as “printable” characters. The rest are “non-printable” characters.
So, trying to view or search through a binary or data file for text strings is a problem. And that’s where
strings comes in. It extracts strings of printable characters from files so that other commands can use the strings without having to contend with non-printable characters.
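To get a feel for what “extracting strings of printable characters” means, here is a rough approximation built from tr and grep. It is only a sketch of the idea, not how strings is actually implemented. The sample file and its contents are made up for the illustration, and the pipeline assumes GNU tr and a minimum length of four characters.

```shell
# Create a throwaway file: two pieces of text surrounded by non-printable bytes.
# (The file name and contents are invented for this illustration.)
tmpdir=$(mktemp -d)
printf 'Hello, world\001\002\003usage: demo [options]\000\377' > "$tmpdir/sample.bin"

# Turn every byte that is not printable (or a tab) into a newline, then keep
# only the lines that are at least four characters long.
approx_out=$(LC_ALL=C tr -c '[:print:]\t' '\n' < "$tmpdir/sample.bin" | grep -E '.{4,}')
printf '%s\n' "$approx_out"

rm -r "$tmpdir"
```

The two text fragments come out on separate lines and the non-printable bytes are gone, which is essentially the service strings provides in a single command.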
Using the strings Command
There’s nothing complicated about the
strings command, and its basic use is very simple. We provide the name of the file we wish
strings to search through on the command line.
Here, we're going to use strings on a binary file (an executable file) called “jibber.” We type strings, a space, “jibber” and then press Enter.
strings jibber
The strings are extracted from the file and listed in the terminal window.
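If you don't have a suitable binary to hand, you can try the same thing on a small file you build yourself. The sketch below assumes GNU strings from binutils. The file sample.bin and the text inside it are stand-ins invented for this example, not the contents of jibber.

```shell
# Build a tiny stand-in "binary": a few non-printable bytes, the three-letter
# run "ELF", and two longer pieces of text separated by NUL bytes.
tmpdir=$(mktemp -d)
printf '\177ELF\001\002\000A first message\000\000\003Another message\000' > "$tmpdir/sample.bin"

# Run strings with no options, exactly as we did with jibber.
basic_out=$(strings "$tmpdir/sample.bin")
printf '%s\n' "$basic_out"

rm -r "$tmpdir"
```

Only the two messages are listed. The “ELF” run is printable too, but at three characters it is shorter than the default minimum length, so it is skipped.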
Setting the Minimum String Length
By default, strings will search for strings that are four characters or longer. To set a longer or shorter minimum length, use the
-n (minimum length) option.
Note that the shorter the minimum length, the higher the chances you will see more junk.
Some binary values have the same numerical value as the value that represents a printable character. If two of those numerical values happen to be side by side in the file and you specify a minimum length of two, those bytes will be reported as though they were a string.
To force strings to use two as the minimum length, use the following command.
strings -n 2 jibber
We now have two-letter strings included in the results. Note that spaces are counted as a printable character.
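The effect of the minimum length is easy to see on a file whose contents you control. This is a small sketch assuming GNU strings. The file short.bin and its contents are hypothetical, chosen so that each setting gives a different result.

```shell
# Three printable runs of different lengths, separated by non-printable bytes:
# "ab" (2 characters), "xyz" (3) and "long enough" (11).
tmpdir=$(mktemp -d)
printf 'ab\001xyz\002long enough\003' > "$tmpdir/short.bin"

default_out=$(strings "$tmpdir/short.bin")        # default minimum of 4
short_out=$(strings -n 2 "$tmpdir/short.bin")     # minimum lowered to 2
long_out=$(strings -n 12 "$tmpdir/short.bin")     # minimum raised to 12

echo "default:"; printf '%s\n' "$default_out"
echo "-n 2:";    printf '%s\n' "$short_out"
echo "-n 12:";   printf '%s\n' "$long_out"

rm -r "$tmpdir"
```

With the default setting only “long enough” is reported, with -n 2 all three runs appear, and with -n 12 nothing is long enough to be listed.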
Piping strings Through Less
Because of the length of the output from
strings, we’re going to pipe it through
less. We can then scroll through the file looking for text of interest.
strings jibber | less
The listing is now presented for us in
less, with the top of the listing displayed first.
Using strings with Object Files
Typically, program source code files are compiled into object files. These are linked with library files to create a binary executable file. We have the jibber object file to hand, so let’s have a look inside that file. Note the “.o” file extension.
strings jibber.o | less
The first set of strings are all wrapped at column eight if they are longer than eight characters. If they have been wrapped, an “H” character is in column nine. You may recognize these strings as SQL statements.
Scrolling through the output reveals that this formatting is not used throughout the file.
It is interesting to see the differences in the text strings between the object file and the finished executable.
Searching In Specific Areas in the File
Compiled programs have different areas within themselves that are used to store text. By default,
strings searches the entire file looking for text. This is just as though you had used the
-a (all) option. To have strings search only in initialized, loaded data sections in the file, use the
-d (data) option.
strings -d jibber | less
Unless you have a good reason to, you might as well use the default setting and search the whole file.
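One way to see what difference the two options make is to count the strings each one finds in the same executable. The sketch below assumes GNU strings and uses the ls program purely as a convenient example of a compiled binary. The counts depend on your system and binutils version, so no particular numbers are promised here.

```shell
# Use the ls executable as a convenient compiled program to inspect.
# (Any compiled binary would do; ls is just an assumption for this sketch.)
target=$(command -v ls)

all_count=$(strings -a "$target" | wc -l)     # scan the whole file
data_count=$(strings -d "$target" | wc -l)    # initialized, loaded data sections only

echo "whole file:    $all_count strings"
echo "data sections: $data_count strings"
```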
Printing the String Offset
We can have
strings print the offset from the start of the file at which each string is located. To do this, use the
-o (offset) option.
strings -o parse_phrases | less
The offset is given in octal.
To have the offset displayed in a different numerical base, such as decimal or hexadecimal, use the -t (radix) option. The radix option must be followed by d (decimal), x (hexadecimal), or o (octal). Using -t o is the same as using -o.
strings -t d parse_phrases | less
The offsets are now printed in decimal.
strings -t x parse_phrases | less
The offsets are now printed in hexadecimal.
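To check that the offsets mean what we think they mean, we can build a file where we know exactly where each string starts. This sketch assumes GNU strings. The file offsets.bin is made up for the example: 16 zero bytes of padding, then one string, a single zero byte, and a second string.

```shell
tmpdir=$(mktemp -d)

# 16 NUL bytes, then "first string" (starts at byte 16, 12 characters long),
# one NUL byte, then "second string" (starts at byte 29).
{ head -c 16 /dev/zero; printf 'first string\000second string\000'; } > "$tmpdir/offsets.bin"

dec_out=$(strings -t d "$tmpdir/offsets.bin")   # decimal offsets: 16 and 29
hex_out=$(strings -t x "$tmpdir/offsets.bin")   # hexadecimal offsets: 10 and 1d
oct_out=$(strings -o "$tmpdir/offsets.bin")     # octal offsets: 20 and 35

printf '%s\n' "$dec_out" "$hex_out" "$oct_out"

rm -r "$tmpdir"
```

Each line of output starts with the offset, padded with spaces, followed by the string found at that position.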
Including Whitespace
By default, strings considers tab and space characters to be part of the strings it finds. Other whitespace characters, such as newlines and carriage returns, are not treated as part of the strings. The -w (whitespace) option causes strings to treat all whitespace characters as parts of the string.
strings -w add_data | less
We can see the blank line in the output, which is a result of the (invisible) carriage return and newline characters at the end of the second line.
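A quick way to convince yourself of the difference is to combine -w with a minimum length. The sketch below assumes a version of GNU strings recent enough to support -w. The file ws.bin is invented for the example and holds two four-letter words separated by a newline.

```shell
tmpdir=$(mktemp -d)

# "abcd", a newline, "efgh", then a NUL byte to end the run.
printf 'abcd\nefgh\000' > "$tmpdir/ws.bin"

# Without -w the newline splits the text into two 4-character strings,
# neither of which reaches the minimum length of 6.
no_w=$(strings -n 6 "$tmpdir/ws.bin")

# With -w the newline counts as part of the string, giving one 9-character
# string that is printed across two lines.
with_w=$(strings -w -n 6 "$tmpdir/ws.bin")

echo "without -w: [$no_w]"
echo "with -w:"
printf '%s\n' "$with_w"

rm -r "$tmpdir"
```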
We’re Not Limited to Files
We can use
strings with anything that is, or can produce, a stream of bytes.
With this command, we can look through the random access memory (RAM) of our computer.
We need to use
sudo because we’re accessing /dev/mem. This is a character device file which holds an image of the main memory of your computer.
sudo strings /dev/mem | less
The listing isn’t the entire contents of your RAM. It is just the strings that can be extracted from it.
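The same idea works on a much smaller scale with a pipe. When GNU strings is given no filename, it reads from standard input, so any command that produces bytes can feed it. The bytes in this sketch are made up for the illustration.

```shell
# Send a few bytes down a pipe: two non-printable bytes, some text,
# a NUL byte and one more non-printable byte.
stream_out=$(printf '\001\002from a pipe\000\003' | strings)
printf '%s\n' "$stream_out"
```

Only the text “from a pipe” survives, exactly as it would if the bytes had been read from a file.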
Searching Many Files At Once
Wildcards can be used to select groups of files to be searched. The * character represents any sequence of characters (including none), and the ? character represents any single character. You can also choose to provide many filenames on the command line.
We’re going to use a wildcard and search through all of the executable files in the /bin directory. Because the listing will contain results from many files, we will use the
-f (filename) option. This will print the filename at the start of each line. We can then see which file each string was found in.
We’re piping the results through grep, and looking for strings that contain the word “Copyright.”
strings -f /bin/* | grep Copyright
We get a neat listing of the copyright statements for each file in the /bin directory, with the name of the file at the start of each line.
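If you would rather experiment without trawling through /bin, the same pipeline can be tried on a few files of your own. This sketch assumes GNU strings. The three .bin files and the copyright notices inside them are entirely made up for the example.

```shell
tmpdir=$(mktemp -d)

# Three small stand-in binaries; only two of them contain a copyright notice.
printf '\001\002Copyright 2019 Example Corp\000' > "$tmpdir/alpha.bin"
printf '\001\002no notice in here\000'           > "$tmpdir/beta.bin"
printf '\003Copyright 2020 Another Example\000'  > "$tmpdir/gamma.bin"

# -f prefixes each string with the name of the file it came from,
# so grep's matches still tell us where they were found.
found=$(strings -f "$tmpdir"/*.bin | grep Copyright)
printf '%s\n' "$found"

rm -r "$tmpdir"
```

Each matching line starts with the filename and a colon, followed by the string itself, so the file without a notice simply does not appear.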
There’s no mystery to strings; it is a typical Linux command. It does something very specific and does it very well.
It’s another of Linux’s cogs, and really comes to life when it is working with other commands. When you see how it can sit between binary files and other tools like
grep, you start to appreciate the functionality of this slightly obscure command.