How To Search Compressed Files Using Linux

Introduction

This guide will show you how to search compressed files for a string of text or for a particular expression.

How To Search And Filter Results Using The Grep Command

One of the most powerful Linux commands is grep, which stands for "global regular expression print".

You can use grep to search for patterns within the contents of a file or the output from another command.

As an example, if you run the following ps command, you will see a list of the processes that are running on your computer.

ps -ef

The results scroll past quickly, and there are usually a large number of them. This makes viewing the information particularly painful.

You could of course use the more command to list one page of results at a time as follows:

ps -ef | more

Whilst the output from the above command is better than the previous one, you still have to page through the results to find what you are looking for.

The grep command makes it possible to filter the results based on a pattern you give it. For example, to list every process whose line contains the text "root" (which includes all processes with the UID set to root), run the following command:

ps -ef | grep root
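Because grep matches the text anywhere on the line, the command above can also return processes that merely mention "root" in their arguments. The following is a minimal sketch of the difference, assuming the usual ps -ef layout in which the UID is the first column. The sample text is made up for illustration so that the result is reproducible.

```shell
# Hypothetical sample in the layout of `ps -ef` (UID is the first column).
sample='UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 09:00 ?        00:00:01 /sbin/init
gary      2210     1  0 09:05 ?        00:00:00 gedit /home/gary/root-notes.txt
root       612     1  0 09:00 ?        00:00:00 /usr/sbin/cron'

# Plain grep matches "root" anywhere on the line, including the gedit argument.
loose=$(printf '%s\n' "$sample" | grep root)

# awk compares only the first field, so only processes owned by root remain.
strict=$(printf '%s\n' "$sample" | awk '$1 == "root"')

printf 'grep root:\n%s\n\nawk on UID column:\n%s\n' "$loose" "$strict"
```

On a live system the same idea would be ps -ef | awk '$1 == "root"'.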

The grep command also works on files. Imagine you have a file called booklist which contains a list of book titles, and you want to see whether it contains "Little Red Riding Hood". You can search the file as follows:

grep "Little Red Riding Hood" booklist

The grep command is very powerful, and the expressions and switches it accepts can generally be used with the compressed-file commands covered in the rest of this article.
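As a minimal sketch of three commonly used grep switches (-i, -c and -n), the following creates a small booklist file in a temporary directory and searches it. The file contents and the variable names are assumptions made for illustration.

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical contents for the booklist file.
printf '%s\n' "Little Red Riding Hood" "The Three Little Pigs" "Goldilocks" > booklist

# -i ignores case, so a lowercase pattern still finds the title.
found=$(grep -i "little red riding hood" booklist)

# -c prints the number of matching lines instead of the lines themselves.
count=$(grep -c "Little" booklist)

# -n prefixes each matching line with its line number.
numbered=$(grep -n "Goldilocks" booklist)

printf '%s\n%s\n%s\n' "$found" "$count" "$numbered"
```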

How To Search Compressed Files Using The zgrep Command

A little-known but very powerful tool is zgrep. The zgrep command lets you search the contents of a compressed file without extracting the contents first.

The zgrep command can be used against files compressed using the gzip command and, in limited cases, against zip files.

What is the difference?

A zip file can contain multiple files, whereas a file compressed using the gzip command contains only the original file.

To search for text within a file compressed with gzip you can simply enter the following command:

zgrep expression filetosearch

For example, imagine the booklist file has been compressed using gzip, producing booklist.gz. You can search for the text "Little Red Riding Hood" in the compressed file using the following command:

zgrep "Little Red Riding Hood" booklist.gz

You can use any expression and most of the switches available via the grep command as part of the zgrep command.
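To try this end to end, the following minimal sketch compresses a small sample file with gzip and then searches it with zgrep, passing the -i and -c switches through to grep. The file contents and the variable names are assumptions made for illustration.

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical contents for the booklist file.
printf '%s\n' "Little Red Riding Hood" "The Three Little Pigs" "Goldilocks" > booklist

# gzip replaces booklist with booklist.gz.
gzip booklist

# Search the compressed file directly; -i (ignore case) is handed on to grep.
found=$(zgrep -i "little red riding hood" booklist.gz)

# Count the matching lines without decompressing the file to disk.
count=$(zgrep -c "Little" booklist.gz)

printf '%s\n%s\n' "$found" "$count"
```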

How To Search Compressed Files Using The zipgrep Command

The zgrep command works well with files compressed using gzip but doesn't work so well on files compressed using the zip utility.

You can use zgrep if the zip file contains a single file but most zip files contain more than one file.

The zipgrep command is used to search for patterns within a zip file.

As an example, imagine you have a file called books with the following titles:

  • Harry Potter And The Chamber Of Secrets
  • Taming Of The Shrew
  • Of Mice And Men
  • The Hitchhikers Guide To The Galaxy
  • Harry Potter And The Order Of The Phoenix

Also imagine you have a file called movies with the following titles:

  • The Matrix
  • Harry Potter And The Chamber Of Secrets
  • Harry Potter And The Goblet Of Fire
  • Star Wars - A New Hope

Now imagine these two files have been compressed using the zip format into a file called media.zip.

You can use the zipgrep command to find patterns within all the files within the zip file. The general syntax is:

zipgrep pattern filename

For example, to find all the occurrences of "Harry Potter", you would use the following command:

zipgrep "Harry Potter" media.zip

The output will be as follows:

books:Harry Potter And The Chamber Of Secrets

books:Harry Potter And The Order Of The Phoenix

movies:Harry Potter And The Chamber Of Secrets

movies:Harry Potter And The Goblet Of Fire

You can use the same kinds of expressions with zipgrep that you can use with grep, which makes the tool very powerful. Searching a zip file this way is much simpler than decompressing, searching and then compressing again.

If you only want to search certain files within the zip file, you can name them after the zip file as part of the command as follows:

zipgrep "Harry Potter" media.zip movies

The output will now be as follows:

movies:Harry Potter And The Chamber Of Secrets

movies:Harry Potter And The Goblet Of Fire

If you want to search all of the files except for one, you can use the -x switch as follows:

zipgrep "Harry Potter" media.zip -x books

This produces the same output as before, because it searches all files within media.zip except for books, which leaves only movies.
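The three zipgrep commands above can be tried with the following minimal sketch, which rebuilds the books and movies files in a temporary directory and zips them into media.zip. It assumes the zip and zipgrep commands are installed and skips itself otherwise. The variable names are made up for illustration.

```shell
# Skip the demonstration if the zip tools are not installed.
if command -v zip >/dev/null 2>&1 && command -v zipgrep >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  cd "$workdir" || exit 1

  # Recreate the two lists from the article.
  printf '%s\n' \
    "Harry Potter And The Chamber Of Secrets" \
    "Taming Of The Shrew" \
    "Of Mice And Men" \
    "The Hitchhikers Guide To The Galaxy" \
    "Harry Potter And The Order Of The Phoenix" > books
  printf '%s\n' \
    "The Matrix" \
    "Harry Potter And The Chamber Of Secrets" \
    "Harry Potter And The Goblet Of Fire" \
    "Star Wars - A New Hope" > movies

  # -q keeps zip quiet while it builds the archive.
  zip -q media.zip books movies

  # Search every member, only the movies member, and everything except books.
  # Any warnings on stderr are discarded so only the matches are captured.
  all=$(zipgrep "Harry Potter" media.zip 2>/dev/null)
  only_movies=$(zipgrep "Harry Potter" media.zip movies 2>/dev/null)
  not_books=$(zipgrep "Harry Potter" media.zip -x books 2>/dev/null)

  printf 'all:\n%s\n\nmovies only:\n%s\n\nexcluding books:\n%s\n' \
    "$all" "$only_movies" "$not_books"
else
  echo "zip or zipgrep not found; skipping"
fi
```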