How To Use "bzip2" To Compress Files

Shrink Files Using bz2 Compression


The one thing you all know about Linux is that there is a lot of variety. There are hundreds of Linux distributions, with dozens of desktop environments, multiple office suites, graphics packages and audio packages.

Another area where Linux provides variety is when it comes to compressing files. 

Windows users will already know what a zip file is. On Linux, the "zip" and "unzip" commands compress and decompress files in the "zip" format.

Another method for compressing files is the "gzip" command, which produces files with a "gz" extension. To decompress such a file you can use the "gunzip" command.

In this guide, I will show you another compression command called "bzip2".

Why Use "bzip2" Over "gzip"?

The "gzip" command uses the DEFLATE method, which combines LZ77 with Huffman coding. The "bzip2" compression tool uses the "Burrows-Wheeler" transform, followed by Huffman coding.

So which method should you use to compress a file?

If you visit this page you will see that both compression methods have been matched side by side.

The test runs each command using the default compression settings and you will see that the "bzip2" command comes out on top when it comes to reducing the filesize.

However, if you look at the time taken, "bzip2" needs much longer than "gzip" to compress the file.

It is worth pointing out the 3rd column on the chart, which is labeled "lzmash". This is not "gzip" under another name: "lzmash" is a separate tool from the older LZMA Utils package and uses the LZMA algorithm, the same family the "xz" command uses today. (If you want "gzip" itself to squeeze as hard as it can, set the compression level to "-9" or to put it in English, "most compressed".)

According to the chart, the "lzmash" command takes longer than the "gzip" command by default but the file is reduced considerably and it is smaller than the "bzip2" equivalent. It is also worth noting that it takes less time than "bzip2" to do so.

Your decision, therefore, will be how much you wish to compress the files by and how long you are willing to wait for it to happen.

Between the two commands compared here, "gzip" is the quicker and "bzip2" produces the smaller file.
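If you would rather measure than rely on a chart, a rough comparison is easy to run yourself. The sketch below is only an illustration: the sample file is generated on the spot, the variable names are my own, and it uses the "-c" switch (write to standard output), which this guide does not otherwise cover. Results on your own files will differ.

```shell
# Rough size comparison of gzip and bzip2 at their default settings.
workdir=$(mktemp -d)

# Generate a repetitive text file as stand-in data (an assumption, not a benchmark).
awk 'BEGIN { for (i = 0; i < 5000; i++) print "line " i ": the quick brown fox jumps over the lazy dog" }' > "$workdir/sample.txt"

# -c sends the compressed data to standard output and leaves the file alone.
orig_size=$(wc -c < "$workdir/sample.txt")
gz_size=$(gzip -c "$workdir/sample.txt" | wc -c)
bz_size=$(bzip2 -c "$workdir/sample.txt" | wc -c)

echo "original: $orig_size bytes"
echo "gzip:     $gz_size bytes"
echo "bzip2:    $bz_size bytes"

rm -rf "$workdir"
```

Prefixing each compression line with the "time" command gives a feel for the speed difference as well.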

Compressing Files Using "bzip2"

To compress a file using the "bzip2" format run the following command:

bzip2 filename

The file will be replaced by a compressed copy with ".bz2" added to the end of its name, so "filename" becomes "filename.bz2".

The "bzip2" command will always try to compress the file, even if the file becomes larger as a result. This can happen when you are compressing a file which has already been compressed.

If compressing a file would produce a file with the same name as an existing compressed file, an error will occur.

For example, if you have a file called "file1" and the folder already has a file called "file1.bz2" then upon running the "bzip2" command you will see the following output:

bzip2: Output file file1.bz2 already exists
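To see this behaviour without risking real data, you could try something like the following in a scratch directory. It is a minimal sketch: the directory comes from "mktemp", and the variable names are purely illustrative.

```shell
# Compress a file, then show that bzip2 refuses to overwrite an existing .bz2.
workdir=$(mktemp -d)

echo "sample text" > "$workdir/file1"
bzip2 "$workdir/file1"                 # file1 is replaced by file1.bz2
after_first=$(ls "$workdir")

# Recreate file1 so that both file1 and file1.bz2 exist, then try again.
echo "newer text" > "$workdir/file1"
if bzip2 "$workdir/file1"; then
    second_run="compressed"
else
    second_run="refused"               # bzip2 will not overwrite file1.bz2
fi

echo "after first run: $after_first"
echo "second run: $second_run"

rm -rf "$workdir"
```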

How To Decompress Files

There are many different ways to decompress files which have the "bz2" extension.

You can use the "bzip2" command as follows:

bzip2 -d filename.bz2

This will decompress the file and remove the "bz2" extension.

If decompressing the file would overwrite an existing file with the same name, you will see the following error:

bzip2: Output file filename already exists

A nicer way to decompress files with the "bz2" extension is to use the "bunzip2" command.

With this command you do not need to specify any switches as shown below:

bunzip2 filename.bz2

The "bunzip2" command runs exactly the same way as the "bzip2" command with the minus d (-d) switch.
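As a quick check that the two forms really do the same job, the sketch below compresses two identical files and restores one with each command. The file names and the scratch directory are made up for the example.

```shell
# Decompress one file with "bzip2 -d" and an identical one with "bunzip2".
workdir=$(mktemp -d)

echo "round trip test" > "$workdir/a.txt"
cp "$workdir/a.txt" "$workdir/b.txt"
bzip2 "$workdir/a.txt" "$workdir/b.txt"    # several files can be given at once

bzip2 -d "$workdir/a.txt.bz2"              # restores a.txt
bunzip2 "$workdir/b.txt.bz2"               # restores b.txt

# cmp -s is silent and only reports through its exit status.
if cmp -s "$workdir/a.txt" "$workdir/b.txt"; then
    same="yes"
else
    same="no"
fi
restored=$(cat "$workdir/a.txt")

echo "identical results: $same"
echo "restored content: $restored"

rm -rf "$workdir"
```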

The "bunzip2" command can extract any valid file that has been compressed using "bzip2". As well as decompressing ordinary files it can also decompress tar files which have been compressed using the "bzip2" command.

Tar files compressed using the "bzip2" command usually have the extension ".tar.bz2", which is often shortened to ".tbz2". When you decompress a ".tbz2" file using the "bunzip2" command the filename becomes "filename.tar".

If you have a valid file which has been compressed with "bzip2" but it has a different extension, then "bzip2" will still decompress the file but it will add the ".out" extension to the end of the full filename. For example "myfile.myf" will become "myfile.myf.out".
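If you want to check exactly which name your copy of "bzip2" chooses in this situation, a throwaway test along these lines should show it. This is a sketch under my own assumptions: the "-c" switch (write to standard output) is used only to create a compressed file with an unusual extension.

```shell
# Decompress bzip2 data stored under an extension bzip2 does not recognise.
workdir=$(mktemp -d)

echo "sample text" > "$workdir/myfile"
bzip2 -c "$workdir/myfile" > "$workdir/myfile.myf"   # compressed data, odd name
rm "$workdir/myfile"

bunzip2 "$workdir/myfile.myf"          # prints a warning about the name it picks
result=$(ls "$workdir")
echo "decompressed to: $result"

rm -rf "$workdir"
```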

How To Force Files To Be Compressed

If you want the "bzip2" command to compress a file regardless of whether a file with the "bz2" extension already exists then you can use the following command:

bzip2 -f myfile

If you have a file called "myfile" and another called "myfile.bz2" then the "myfile.bz2" file will be overwritten when "myfile" is compressed.
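The following sketch shows the overwrite happening. It reads the archive back with "bzip2 -d -c" (decompress to standard output, a combination not covered elsewhere in this guide) so you can see which version survived; the file contents are invented for the example.

```shell
# Show that -f replaces an existing .bz2 file with the newly compressed data.
workdir=$(mktemp -d)

echo "first version" > "$workdir/myfile"
bzip2 "$workdir/myfile"                # creates myfile.bz2

echo "second version" > "$workdir/myfile"
bzip2 -f "$workdir/myfile"             # overwrites the existing myfile.bz2

# -d -c decompresses to standard output so the archive can be inspected.
content=$(bzip2 -d -c "$workdir/myfile.bz2")
echo "archive now holds: $content"

rm -rf "$workdir"
```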

How To Keep Both Files

If you want to keep the file you are compressing and the compressed file you can use the following command:

bzip2 -k myfile

This will keep the "myfile" file but will also compress it and create a "myfile.bz2" file.

You can also use the minus k (-k) switch with the "bunzip2" command to keep both the compressed file and uncompressed file whilst decompressing the file.
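Here is a small sketch of the keep switch working in both directions. It simply counts the files in a scratch directory after each step; the names are illustrative.

```shell
# Use -k so that neither compressing nor decompressing removes the input file.
workdir=$(mktemp -d)

echo "keep me" > "$workdir/myfile"
bzip2 -k "$workdir/myfile"                     # myfile and myfile.bz2 both exist
after_compress=$(ls "$workdir" | wc -l)

rm "$workdir/myfile"                           # leave only the compressed copy
bunzip2 -k "$workdir/myfile.bz2"               # myfile is restored, .bz2 is kept
after_decompress=$(ls "$workdir" | wc -l)

echo "files after compressing: $after_compress"
echo "files after decompressing: $after_decompress"

rm -rf "$workdir"
```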

Test The Validity Of A "bz2" File

You can check the integrity of a file compressed with the "bzip2" compression mechanism using the following command:

bzip2 -t filename.bz2

This performs a trial decompression without writing anything to disk. If the file is valid then no output will be returned, but if the file is not valid you will receive a message saying so.
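Because the test also sets the exit status of the command, it is handy in scripts. The sketch below checks one genuine file and one impostor (plain text that merely has a ".bz2" name); both files are created just for the demonstration.

```shell
# Test a real bzip2 file and a fake one, using the exit status of "bzip2 -t".
workdir=$(mktemp -d)

echo "valid data" > "$workdir/good"
bzip2 "$workdir/good"                                  # produces good.bz2
echo "plain text, not bzip2 data" > "$workdir/bad.bz2" # wrong content, right name

if bzip2 -t "$workdir/good.bz2"; then good="valid"; else good="invalid"; fi
if bzip2 -t "$workdir/bad.bz2"; then bad="valid"; else bad="invalid"; fi

echo "good.bz2 is $good"
echo "bad.bz2 is $bad"

rm -rf "$workdir"
```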

Use Less Memory When Compressing Files

If the "bzip2" command is using too much memory whilst compressing a file you can reduce the impact by specifying the minus s (-s) switch as follows:

bzip2 -s filename

Note that the saving has a cost. When compressing, this switch works on smaller blocks of data, so the file may not shrink quite as much, and decompressing with it takes longer.
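To judge whether the trade-off matters for your data, you could compare the output of a normal run with a reduced-memory run. This is a rough sketch: the test file is generated with "awk", the sizes are read from standard output via the "-c" switch, and the figures will depend on your own files.

```shell
# Compare default compression with the reduced-memory -s mode on sample data.
workdir=$(mktemp -d)

# Roughly 900 KB of generated text, an arbitrary stand-in for a real file.
awk 'BEGIN { for (i = 0; i < 20000; i++) print "record " i ": some sample text to compress" }' > "$workdir/big.txt"

default_size=$(bzip2 -c "$workdir/big.txt" | wc -c)
small_size=$(bzip2 -s -c "$workdir/big.txt" | wc -c)

# Confirm that data compressed and decompressed with -s comes back unchanged.
if bzip2 -s -c "$workdir/big.txt" | bzip2 -d -s -c | cmp -s - "$workdir/big.txt"; then
    round_trip="ok"
else
    round_trip="failed"
fi

echo "default:    $default_size bytes"
echo "with -s:    $small_size bytes"
echo "round trip: $round_trip"

rm -rf "$workdir"
```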

Get More Information When Compressing Files

By default when you run the "bzip2" or "bunzip2" commands you don't receive any output and the new file just appears.

If you want to know what is happening when you compress or decompress a file you can get more verbose output by specifying the minus v (-v) switch as follows:

bzip2 -v filename

The output will appear as follows:

  filename:  1.172:1,  6.827 bits/byte, 14.66% saved, 50341 in, 42961 out.

The important parts are the percentage saved, the input size and the output size.
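The other figures on the line are derived from those two sizes, and you can recompute them yourself. The sketch below does the arithmetic with "awk" using the byte counts from the example above; the variable names are my own.

```shell
# Recompute the verbose-mode figures from the input and output sizes.
in_bytes=50341
out_bytes=42961

summary=$(awk -v i="$in_bytes" -v o="$out_bytes" 'BEGIN {
    # ratio = in/out, bits per byte = 8*out/in, saved = share of bytes removed
    printf "%.3f:1, %.3f bits/byte, %.2f%% saved", i / o, 8 * o / i, 100 * (1 - o / i)
}')

echo "$summary"
```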

Recover Broken Files

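If you have a broken "bz2" file then the program to use to try to recover the data is "bzip2recover":

bzip2recover filename.bz2

It writes each block it can salvage to its own file, named "rec00001filename.bz2", "rec00002filename.bz2" and so on. You can then check these with "bzip2 -t" and decompress the ones that pass.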