How To Show A File's Printable Characters With The Strings Command

Introduction

Have you ever tried to open a file in an editor only to find out that it contains unreadable binary content? 

The Linux "strings" command makes it possible to view the human readable characters within any file. 

The main purpose of the "strings" command is to help you work out what type of file you are looking at, but you can also use it to extract text. For instance, if you have a file from a proprietary program that saves its data in an unfamiliar binary format, you can use "strings" to pull out the text you put into the file.

Example Usage Of The Strings Command

A great way to demonstrate the power of the strings command is to create a document using LibreOffice Writer.

Simply open LibreOffice Writer and enter some text and then save it in the standard ODT format.

Now open a terminal window (press CTRL, ALT and T at the same time) and then use the cat command to display the file as follows:

cat yourfilename.odt | more

(Replace yourfilename.odt with the name of the file you created.)

What you will see is a whole wall of illegible text.

Press the space bar to scroll through the file. Sporadically throughout the file you will see some of the text you have entered.

The strings command can be used to display just the parts that are human readable.

In its simplest form you can run the following command:

strings yourfilename.odt | more

As before, a wall of text will appear, but this time it only contains text that you can read. If you are lucky you will be able to see some of your own text, but to be honest it is still a bit hit and miss.

The key piece of information, however, is on the first line:

mimetypeapplication/vnd.oasis.opendocument.text

We know that the file type is a LibreOffice Writer ODT file for 2 reasons:

  1. We created the file
  2. The extension is .ODT

Imagine that you didn't create the file or you found the file on a recovered disk and the file did not have an extension.

Windows recovery would often recover files with names like 0001, 0002, 0003 etc. The fact that the files were recovered is great but trying to work out what the types of those files were was a nightmare.

By using strings you have a fighting chance of working out the file type. Knowing that a file is an OpenDocument text file means you can rename it with the .odt extension and open it in LibreOffice Writer.
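When a recovery tool leaves you with a whole folder of files without extensions, the same check can be looped over all of them. The following is a minimal sketch, assuming GNU strings is installed. The file names 0001 and 0002 and their contents are made-up stand-ins for recovered files, built with printf so that the example is self-contained.

```shell
# Create a scratch folder holding two stand-in "recovered" files.
# The names and contents are invented for this sketch.
recovered=$(mktemp -d)
printf 'PK\003\004\000\000mimetypeapplication/vnd.oasis.opendocument.text\000\001' > "$recovered/0001"
printf '%%PDF-1.4\n\001\002\003' > "$recovered/0002"

# Show the first readable string in each file next to its name.
report=$(
  for f in "$recovered"/*; do
    printf '%s: %s\n' "${f##*/}" "$(strings "$f" | head -n 1)"
  done
)
printf '%s\n' "$report"

# Tidy up the scratch folder.
rm -rf "$recovered"
```

On real recovered files the first string will not always be this helpful, so treat it as a hint rather than proof of the file type.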

In case you were unaware, an ODT file is basically a ZIP archive. If you rename yourfilename.odt to yourfilename.zip you can open it in an archiving tool and even unzip it.

Alternative Behaviours

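By default the strings command returns all strings within a file, but you can switch the behaviour so that it returns only strings from the initialised, loaded data sections of a file.

What does this mean exactly? Object files and executables are divided into sections, and with this switch strings scans only the sections holding initialised data that gets loaded into memory, rather than every byte in the file. For other kinds of file, such as documents, it should make little or no difference.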

I would assume that you are using strings to try and either find out the file type or to look for specific text in a file.

If the default behaviour doesn't give you the output you were hoping for, try running one of the following commands to see if it makes a difference:

strings -d yourfilename

strings --data yourfilename

The manual page states that this switch may help to reduce the amount of garbage returned from strings.

Personally I have found little use for it.

The "strings" command can be built to work in reverse so that the -d switch is the default behaviour. If this is the case on your system then you can return all the data by using the following command:

strings -a yourfilename

 

Formatting Output

You can have strings print the name of the file in front of each line of output.

To do this run one of the following commands:

strings -f yourfilename

strings --print-file-name yourfilename

The output will now look something like this:

yourfilename: a piece of text

yourfilename: another piece of text
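The file name prefix earns its keep when you search several files at once, because each match tells you which file it came from. Here is a small sketch, assuming GNU strings and grep are installed. The files first.bin and second.bin are hypothetical and are created just for the demonstration.

```shell
# Two throwaway binary files, each with one readable string inside.
workdir=$(mktemp -d)
printf '\001\002a piece of text\000\003' > "$workdir/first.bin"
printf '\004\005another piece of text\000' > "$workdir/second.bin"

# Run from inside the folder so the prefix is the bare file name,
# then keep only the lines that mention the word we are looking for.
matches=$(cd "$workdir" && strings -f first.bin second.bin | grep 'another')
printf '%s\n' "$matches"

# Tidy up the scratch folder.
rm -rf "$workdir"
```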

As part of the output you can also display the offset of where that text appears in a file. To do so run the following command:

strings -o yourfilename

The output will look something like this:

16573 your

17024 text

With GNU strings the -o switch prints the offset in octal, although some other versions of strings treat -o as a decimal offset instead.

A more reliable way of getting the offset in the number base you want is to use one of the following commands:

strings -t d yourfilename

strings -t o yourfilename

strings -t x yourfilename

The -t switch means return the offset, and the character that follows determines the offset type (d = decimal, o = octal, x = hex).
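To see how the three offset types relate to each other, you can build a file where you already know where the text starts. This is a rough sketch, assuming GNU strings, head and awk are available. The padding of 16 zero bytes is an arbitrary choice for the illustration.

```shell
# 16 unreadable bytes of padding, so the text starts at byte 16.
sample=$(mktemp)
{ head -c 16 /dev/zero; printf 'your text\000'; } > "$sample"

# awk keeps only the first column of the output, which is the offset.
dec=$(strings -t d "$sample" | awk '{print $1}')
oct=$(strings -t o "$sample" | awk '{print $1}')
hex=$(strings -t x "$sample" | awk '{print $1}')
printf 'decimal=%s octal=%s hex=%s\n' "$dec" "$oct" "$hex"

# Tidy up the scratch file.
rm -f "$sample"
```

The same position comes back as 16 in decimal, 20 in octal and 10 in hex, which is why it pays to state the base explicitly rather than rely on -o.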

By default the strings command prints each new string on a new line but you can set the delimiter of your choice. For example to use a pipe symbol ("|") as the delimiter run the following command:

strings -s "|" yourfilename
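As a quick illustration of the separator, the sketch below runs the command against a throwaway file holding two readable strings. It assumes a version of GNU strings recent enough to support the -s switch, and the file contents are invented for the example.

```shell
# A throwaway file with two readable strings split by unreadable bytes.
sample=$(mktemp)
printf '\001alpha text\000\002beta text\000' > "$sample"

# Each string is now followed by the separator instead of a line break,
# so everything comes back on a single line.
joined=$(strings -s "|" "$sample")
printf '%s\n' "$joined"

# Tidy up the scratch file.
rm -f "$sample"
```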

Adjust The String Limit

By default the strings command looks for runs of at least 4 printable characters in a row. You can raise the minimum so that it only returns strings of at least 8 printable characters, or 12, or lower it to pick up shorter strings.

By adjusting this limit you can tailor the output to get the best possible result. Set the minimum too high and you risk omitting useful text, but set it too low and you might end up with far more junk returned.

To adjust the string limit run the following command:

strings -n 8 yourfilename

In the above example I have changed the limit to 8. You can replace 8 with the number of your choice.

You can also use the following command to do the same thing:

strings --bytes=8 yourfilename
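The effect of the limit is easiest to see on a file with strings of known lengths. The following is a small sketch, assuming GNU strings. The file holds three invented strings of 3, 6 and 10 characters.

```shell
# Three readable strings of 3, 6 and 10 characters, split by unreadable bytes.
sample=$(mktemp)
printf '\001abc\000\002abcdef\000\003abcdefghij\000' > "$sample"

default_out=$(strings "$sample")       # minimum of 4: abcdef and abcdefghij
long_only=$(strings -n 8 "$sample")    # minimum of 8: abcdefghij only
with_short=$(strings -n 3 "$sample")   # minimum of 3: all three strings

printf 'default:\n%s\n\n-n 8:\n%s\n\n-n 3:\n%s\n' "$default_out" "$long_only" "$with_short"

# Tidy up the scratch file.
rm -f "$sample"
```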

Include Whitespace

By default the strings command includes whitespace such as a tab or space as a printable character. Therefore if you have a string which reads as "the cat sat on the mat" then the strings command would return the whole text.

New line characters and carriage returns are not considered to be printable characters by default.

To get strings to recognise new line characters and carriage returns as printable characters, run strings in the following way:

strings -w yourfilename
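Because the output is printed one string per line anyway, the difference can be hard to spot. One way to make it visible is to use words that are too short to be reported on their own. This is only a sketch, assuming a version of GNU strings that supports -w, with an invented two-word file.

```shell
# Two 3-character words separated by a new line character.
sample=$(mktemp)
printf '\001one\ntwo\000' > "$sample"

# Without -w the new line splits the text into two 3-character strings,
# which both fall below the default minimum of 4, so nothing is printed.
without_w=$(strings "$sample")

# With -w the new line counts as part of the string, giving 7 characters.
with_w=$(strings -w "$sample")

printf 'without -w: [%s]\nwith -w: [%s]\n' "$without_w" "$with_w"

# Tidy up the scratch file.
rm -f "$sample"
```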

 

Change The Encoding

There are 6 encoding options available for use with strings:

  • s = 7 bit byte (used for ASCII, ISO 8859)
  • S = 8 bit byte
  • b = 16 bit bigendian
  • l = 16 bit littleendian
  • B = 32 bit bigendian
  • L = 32 bit littleendian

The default is 7 bit byte.

To change the encoding run one of the following commands:

strings -e s yourfilename

strings --encoding=s yourfilename

In the above command I have specified the default "s" which means 7 bit byte. Simply replace the "s" with the encoding letter of your choice.
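The 16 bit options matter when a program stores its text with two bytes per character, because the default encoding will then miss it completely. The sketch below, assuming GNU strings, fakes such a file by writing a zero byte after every letter. The text "Wide text" is just an example.

```shell
# "Wide text" stored as 16 bit littleendian: every letter is followed by a zero byte.
sample=$(mktemp)
printf 'W\000i\000d\000e\000 \000t\000e\000x\000t\000' > "$sample"

# The default 7 bit encoding never sees 4 readable bytes in a row.
default_enc=$(strings "$sample")

# The l encoding reads two bytes per character and finds the text.
wide_enc=$(strings -e l "$sample")

printf 'default: [%s]\n-e l: [%s]\n' "$default_enc" "$wide_enc"

# Tidy up the scratch file.
rm -f "$sample"
```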

Change The Binary File Descriptor Name

You can tell strings to treat a file as a specific object code format instead of your system's default format. The format is identified by its binary file descriptor (BFD) name.

This switch is one for the experts. If you need a different format then you can specify it by running the following strings command, replacing bfdname with the name of the format:

strings -T bfdname yourfilename

Reading Options From A File

If you are going to use the same options every time, you won't want to type out all the switches each time you run the command.

What you can do is create a text file using nano and specify the options within that file.

To try this out within a terminal run the following command:

nano stringsopts

In the file enter the following text:

-f -o -n 3 -s "|"

Save the file by pressing CTRL and O followed by Enter, then exit by pressing CTRL and X.

To run the strings command with these options run the following command:

strings @stringsopts yourfilename

The options will be read from the file stringsopts and you should see the filename before each string, the offset and the "|" as a separator.
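If you would rather try this without opening an editor, the whole thing can be scripted. This is a minimal sketch, assuming GNU strings. The file demo.bin and its contents are hypothetical, and the options file is written with printf instead of nano.

```shell
# A throwaway binary file with one 3-character and one longer string.
workdir=$(mktemp -d)
printf '\001\002\003abc\000\004longer text\000' > "$workdir/demo.bin"

# Write the options file without opening an editor.
printf '%s\n' '-f -o -n 3 -s "|"' > "$workdir/stringsopts"

# Run from inside the folder so @stringsopts is found by its bare name.
combined=$(cd "$workdir" && strings @stringsopts demo.bin)
printf '%s\n' "$combined"

# Tidy up the scratch folder.
rm -rf "$workdir"
```

The 3-character string "abc" only shows up because -n 3 was read from the file, which is a handy way to confirm that the options were picked up.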

Getting Help

If you want to read more about strings you can run the following command to get help.

strings --help

Alternatively you can also read the manual page:

man strings

Find Out Which Version Of Strings You Are Running

To find out which version of strings you have, run one of the following commands:

strings -v

strings -V

strings --version