How to Determine the File Type of a File Using Linux

Introduction

Most people look at a file's extension and guess the type of file from that. For example, when you see a file with an extension of gif, jpg, bmp or png, you think of an image file, and when you see a file with an extension of zip, you assume the file has been compressed using a zip compression utility.

In truth, a file can have one extension but be something altogether different. And if a file has no extension at all, how can you determine its type?

In Linux, you can find out the true file type using the file command.

How The File Command Works

According to the documentation, the file command runs three sets of tests against a file, in this order:

  • filesystem tests
  • magic tests
  • language tests

The first test that succeeds causes the file type to be printed.

Filesystem tests examine the result of a stat system call. The program checks whether the file is empty and whether it is a special file of some kind, such as a directory, symbolic link or named pipe. If the file matches one of the types defined in the system header file, that type is reported.
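As a rough illustration of the filesystem tests, the shell's own test operators can answer similar questions without reading a file's contents. The fs_test function below is a hypothetical helper written for this example, not part of the file command, and it covers only a handful of cases:

```shell
# Hypothetical helper: mimic the kind of answer a filesystem test can give.
# It looks only at file metadata and never reads the file's contents.
fs_test() {
    if [ -L "$1" ]; then
        echo "symbolic link"          # checked first, because -d follows links
    elif [ -d "$1" ]; then
        echo "directory"
    elif [ -p "$1" ]; then
        echo "fifo (named pipe)"
    elif [ -f "$1" ] && [ ! -s "$1" ]; then
        echo "empty"                  # a regular file of zero length
    else
        return 1                      # undecided: the contents must be examined
    fi
}
```

Running fs_test /etc prints directory, while a regular file with content makes the function return a non-zero status, which is the point at which the content-based tests would take over.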

The magic tests check the contents of a file, specifically a few bytes near the beginning, which help to determine the file type. The patterns used to match a file to its type are read from /etc/magic and the compiled magic file /usr/share/misc/magic.mgc, or from /usr/share/misc/magic if the compiled file does not exist. You can override these by placing a file called $HOME/.magic.mgc or $HOME/.magic in your home folder.

The final tests are language tests. The file is checked to see if it is a text file. By examining the bytes in the first part of the file, the command can deduce whether it is ASCII, UTF-8, UTF-16 or another character set, which marks the file as a text file. Once the character set has been deduced, the file is tested against different languages.

For example, is the file a C program?

If none of the tests succeed, the output is simply data.
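To see what a magic test looks at, you can read the first bytes of a file yourself. The sketch below assumes the standard od and tr utilities are available. The names magic_hex and looks_like_png are made up for this example, and the real file command consults a far larger table of patterns. It compares the first eight bytes of a file with the PNG signature (89 50 4e 47 0d 0a 1a 0a in hexadecimal):

```shell
# Hypothetical helper: print the first N bytes of a file as lowercase hex.
# $1 = path to the file, $2 = number of bytes to read
magic_hex() {
    od -An -tx1 -N "$2" < "$1" | tr -d ' \n'
}

# Hypothetical helper: succeed if the file starts with the 8-byte PNG signature.
looks_like_png() {
    [ "$(magic_hex "$1" 8)" = "89504e470d0a1a0a" ]
}
```

With these defined, looks_like_png file1 && echo "PNG signature found" checks file1 using nothing but its leading bytes, regardless of its name or extension.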

How To Use The File Command

The file command can be used as follows:

file filename

For example, if you have a file called file1, you would run the following command:

file file1

The output will be something like this:

file1: PNG image data, 640 x 341, 8-bit/color RGB, non-interlaced

The output identifies file1 as an image file or, to be more exact, a Portable Network Graphics (PNG) file.

Different file types produce different results, for example:
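A quick way to convince yourself that the extension plays no part is to give a file a misleading name. The following sketch creates a plain text file called picture.png in a temporary folder (both the name and the contents are arbitrary examples) and asks the file command what it really is:

```shell
# Create a plain text file with a misleading .png extension (example name).
dir=$(mktemp -d)
printf 'just some text\n' > "$dir/picture.png"

# -b prints the description only, keeping the temporary path out of the output.
result=$(file -b "$dir/picture.png")
echo "$result"

# Clean up the temporary folder.
rm -r "$dir"
```

On a typical system this should report ASCII text rather than PNG image data, because the contents, not the name, decide the result.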

  • ISO file type - DOS/MBR boot sector ISO 9660 CD-ROM filesystem data 'label' (bootable); partition 2 : ID = 0xef, start-CHS (0x3ff,254,63), end-CHS (0x3ff,4,63) startsector 1496, 4736 sectors
  • ODS file type - OpenDocument Spreadsheet
  • PDF file type - PDF document, version 1.4
  • CSV file type - ASCII text, with very long lines, with CRLF line terminators

Customise The Output From The File Command

By default, the file command prints the file name followed by the details about the file. If you just want the details without the file name, use the -b (brief) switch:

file -b file1

The output will be something like this:

PNG image data, 640 x 341, 8-bit/color RGB, non-interlaced
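The brief form is handy inside scripts, where the description can be captured and matched against a pattern. The is_text function below is a hypothetical helper that simply looks for the word "text" in the description. It is a sketch, not a rigorous check, since the exact wording of descriptions can vary between versions of the file command:

```shell
# Hypothetical helper: succeed if file's brief description mentions "text".
is_text() {
    case "$(file -b "$1")" in
        *text*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

It could then be used as is_text /etc/passwd && echo "looks like text" to decide, for instance, whether a file is safe to open in an editor.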

You can also change the delimiter between the filename and the type.

By default, the delimiter is a colon (:), but you can change it to anything you like, such as the pipe symbol, as follows:

file -F '|' file1

The output will now be something like this:

file1| PNG image data, 640 x 341, 8-bit/color RGB, non-interlaced

Handling Multiple Files

So far the file command has been run against a single file. You can, however, give it the name of a file which contains a list of files to be processed.

As an example, open a file called testfiles using the nano editor and add these lines to it, one path per line:

/etc/passwd
/etc/pam.conf
/etc/opt

Save the file and run the following file command:

file -f testfiles

The output will be something like this:

/etc/passwd: ASCII text
/etc/pam.conf: ASCII text
/etc/opt: directory
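The list does not have to be typed by hand. As a sketch, the hypothetical helper below writes whatever paths it is given to a temporary list file, one per line, and then passes that list to the file command with -f. It assumes the mktemp utility is available and that none of the paths contain newlines:

```shell
# Hypothetical helper: build a list file from the arguments, then run file -f on it.
file_from_list() {
    list=$(mktemp) || return 1
    printf '%s\n' "$@" > "$list"      # one path per line
    file -f "$list"
    status=$?
    rm -f "$list"                     # remove the temporary list
    return "$status"
}
```

Calling file_from_list /etc/passwd /etc/pam.conf /etc/opt produces the same kind of output as the testfiles example above.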

Compressed Files

By default when you run the file command against a compressed file you will see output something like this:

file.zip: ZIP archive data, at least V2.0 to extract

Whilst this tells you that the file is an archive, it says nothing about what the archive contains. The -z switch asks the file command to try to look inside compressed files:

file -z filename

For a file compressed with a utility such as gzip, the output now describes the data inside as well as the compression used. Which formats can be inspected this way depends on the version of the file command installed, and for an archive holding several files it does not list the type of each member.
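The effect of -z is easiest to see on a single gzip-compressed file. The sketch below assumes gzip is installed and uses an arbitrary file name, hello.gz. It compresses one line of text and compares the description with and without the switch:

```shell
# Compress a short line of text into a temporary .gz file (example name).
dir=$(mktemp -d)
printf 'hello world\n' | gzip > "$dir/hello.gz"

plain=$(file -b "$dir/hello.gz")     # describes the gzip container only
inside=$(file -bz "$dir/hello.gz")   # also tries to describe the data inside

echo "without -z: $plain"
echo "with -z:    $inside"

# Clean up the temporary folder.
rm -r "$dir"
```

The exact wording differs between versions, but the second line would normally mention the type of the uncompressed data alongside the gzip details.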

Summary

In general, most people will simply use the file command to find the basic file type. To find out more about all of the possibilities the file command offers, type the following into the terminal window:

man file
