How To Create A Hexdump Of A File Or String Of Text

Introduction

A hex dump is a hexadecimal view of data. You may wish to use one when debugging a program or when reverse engineering one.

For example, many file formats begin with specific hexadecimal values that denote their type. If a program fails to load a file correctly, it might be that the file isn't in the format you are expecting.

If you want to see how a program works and you have neither the source code nor a piece of software that reverse engineers the code, you can look at the hex dump to try to work out what is happening.

This guide will start off by discussing what hex actually is and will then show you how to create a hex dump from the standard input as well as from files.

What Is Hexadecimal?

Computers think in binary. Every character, number and symbol is stored as one or more binary values.

Human beings, however, tend to think in decimal.

Thousands  Hundreds  Tens  Units
1          0         1     1

In decimal the lowest column is called units and holds the digits 0 to 9. When we get to 10 we reset the units column back to 0 and add 1 to the tens column (10).

128  64  32  16  8  4  2  1
1    0   0   1   0  0  0  1

In binary each column can only hold 0 or 1. When we get past 1 we put a 1 in the 2s column and a 0 in the 1s column. When you want to represent 4 you put a 1 in the 4s column and reset the 2s and 1s columns to 0.

Therefore to represent 15 you would have 1111 which stands for 1 eight, 1 four, 1 two and 1 one. (8+4+2+1 = 15).
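
The place-value arithmetic above can be sketched in a few lines of shell. The function name bin_to_dec is made up for this illustration and is not a standard command; it assumes a POSIX shell and input made up only of 0s and 1s.

```shell
# Convert a string of binary digits to decimal.
# Each step doubles the running total (shifting every digit one
# column to the left) and then adds the next digit.
bin_to_dec() {
    bits=$1
    total=0
    while [ -n "$bits" ]; do
        digit=${bits%"${bits#?}"}   # first remaining character
        bits=${bits#?}              # everything after it
        total=$(( total * 2 + digit ))
    done
    echo "$total"
}

bin_to_dec 1111       # prints 15
bin_to_dec 10010001   # prints 145 (128 + 16 + 1)
```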

If we viewed a data file in binary format it would be absolutely huge and virtually impossible to make sense of.

The next step up from binary is octal, which uses 8 as the base number.

512  64  8  1
0    1   1  0

In an octal system each column holds a digit from 0 to 7. The first column counts 1s, the second 8s, the third 64s, the fourth 512s and so on. The example above therefore represents 64 + 8 = 72. Whilst octal is generally easier to read than binary, most people prefer to use hexadecimal.

Hexadecimal uses 16 as the base number. This is where it gets confusing, because as humans we think of digits as 0 through to 9. So what is used for 10, 11, 12, 13, 14 and 15? The answer is letters.

  • 0 = 0
  • 1 = 1
  • 2 = 2
  • 3 = 3
  • 4 = 4
  • 5 = 5
  • 6 = 6
  • 7 = 7
  • 8 = 8
  • 9 = 9
  • 10 = A
  • 11 = B
  • 12 = C
  • 13 = D
  • 14 = E
  • 15 = F

The decimal value 100 is therefore represented by 64 in hexadecimal. You need 6 in the 16s column, which makes 96, and then 4 in the units column, making 100.
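
If you would rather not do the conversion by hand, the shell's printf command can do it for you. This is a small sketch; the variable names are only for illustration.

```shell
# Decimal 100 written in hexadecimal and in octal.
hex=$(printf '%x' 100)
oct=$(printf '%o' 100)
echo "hex: $hex"      # hex: 64
echo "octal: $oct"    # octal: 144

# And back again: printf accepts a 0x prefix for hexadecimal input.
dec=$(printf '%d' 0x64)
echo "decimal: $dec"  # decimal: 100
```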

Every byte in a file can be written as a hexadecimal value. What these values mean depends on the format of the file itself. The format of the file is usually denoted by hexadecimal values stored at the beginning of the file.

If you know the sequences of hexadecimal values that appear at the beginning of files, you can manually work out what format a file is in. Viewing a file in a hex dump can also help you find hidden characters that aren't shown when the file is loaded into a normal text editor.

How To Create A Hex Dump Using Linux

To create a hex dump using Linux use the hexdump command.

To display a file as hex to the terminal (standard output) run the following command:

hexdump filename

For example

hexdump image.png

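The default output displays the offset of each line within the file (in hexadecimal format) and then eight two-byte values per line, each shown as four hexadecimal digits. The two-byte values are printed in the host's byte order, so on a little-endian machine such as a typical PC the first two bytes of a PNG file, 89 50, appear as 5089.

For example:

0000000 5089 474e 0a0d 0a1a 0000 0d00 4849 5244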
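
To compare the default display with the order in which bytes are actually stored in the file, a small experiment helps. The sketch below writes the first four bytes of a PNG file to a scratch file (first4.bin is a made-up name) and dumps it two ways. It assumes a little-endian machine, and the exact spacing may differ between hexdump versions.

```shell
# Write the bytes 89 50 4e 47 to a scratch file.
# The octal escape \211 is hexadecimal 89; "PNG" supplies 50 4e 47.
printf '\211PNG' > first4.bin

# Default display: two-byte values in the host's byte order.
hexdump first4.bin

# Canonical display: individual bytes in the order they appear in the file.
hexdump -C first4.bin
```

On a little-endian machine the first command should show 5089 474e, while the second shows 89 50 4e 47.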

You can supply different switches to change the default output. For example, specifying the minus b switch will produce the offset followed by sixteen three-column, zero-filled bytes of input data in octal format per line.

hexdump -b image.png

Therefore the above example will now be represented as follows:

0000000 211 120 116 107 015 012 032 012 000 000 000 015 111 110 104 122

The above format is known as one-byte octal display.

Another way to view the file is in one-byte character display using the minus c switch.

hexdump -c image.png

This again displays the offset, but this time followed by sixteen space-separated, three-column, space-filled characters of input data per line.
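
The character display is a handy way to spot hidden characters. As a small illustration (the input string is invented for the example), the following pipes a string containing a tab and a newline into hexdump:

```shell
# A short string with a tab in the middle and a newline at the end.
printf 'Hi\tyou\n' | hexdump -c
```

The tab and the newline should appear in the output as \t and \n rather than as invisible whitespace.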

Other options include canonical hex+ASCII display, which can be displayed using the minus C switch, and two-byte decimal display, which can be displayed using the minus d switch. The minus o switch can be used for two-byte octal display. Finally, the minus x switch can be used for two-byte hexadecimal display.

hexdump -C image.png

hexdump -d image.png

hexdump -o image.png

hexdump -x image.png

If none of the above formats suit your needs, you can use the minus e switch to specify the format yourself.
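
As a rough sketch of what a format string looks like, the example below prints each byte as two hexadecimal digits followed by a space. The format string is only one of many possibilities; the manual page describes the full syntax.

```shell
# "/1" means each conversion consumes one byte; "%02x " prints it as
# two hex digits and a space. The -v switch stops hexdump from
# collapsing repeated data into a "*" line.
printf 'ABC\n' | hexdump -v -e '/1 "%02x "'
echo    # the format above prints no newline, so add one

# Output: 41 42 43 0a
```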

If you know a data file is very long and you just want to see the first few bytes to determine its type, you can use the -n switch to specify how many bytes of the file to display in hex.

hexdump -n100 image.png

The above command displays the first hundred bytes.
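
Limiting the output like this is useful for checking a file's type by its opening bytes. The sketch below is an illustration only: sample.bin is a made-up stand-in file, and the check covers just the eight-byte PNG signature.

```shell
# Build a stand-in file: the eight-byte PNG signature followed by filler text.
printf '\211PNG\r\n\032\nnot real image data' > sample.bin

# Read only the first 8 bytes, printed as one continuous run of hex digits.
magic=$(hexdump -n 8 -v -e '/1 "%02x"' sample.bin)

if [ "$magic" = "89504e470d0a1a0a" ]; then
    echo "sample.bin starts with the PNG signature"
else
    echo "sample.bin does not look like a PNG"
fi
```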

If you wish to skip a portion of the file you can use the minus s switch to set an offset to start from.

hexdump -s10 image.png
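
To see the effect of skipping on a file whose contents you know, you could try something like the following. The file name letters.txt is invented for the example.

```shell
# Sixteen known bytes: the letters A to P.
printf 'ABCDEFGHIJKLMNOP' > letters.txt

# Skip the first 10 bytes and show the rest in canonical form.
hexdump -s 10 -C letters.txt

# The skip and length switches can be combined:
# skip 10 bytes, then show only the next 3.
hexdump -s 10 -n 3 -C letters.txt
```

The first dump should start at offset 0000000a and show the six remaining bytes, KLMNOP. The second should show only KLM.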

If you don't supply a filename the text is read from the standard input. 

Simply enter the following command:

hexdump

Then enter the text into the standard input and finish by pressing Ctrl+D on an empty line to signal the end of input. The hex will be displayed to the standard output.
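
Typing at the terminal is not the only way to supply standard input. A pipe works as well, which is a convenient way to dump a string of text. A minimal example:

```shell
# Pipe a string into hexdump instead of typing it interactively.
printf 'Hello\n' | hexdump -C
```

The dump should show the bytes 48 65 6c 6c 6f 0a, where the final 0a is the newline at the end of the string.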

Summary

The hexdump utility is a powerful tool, and you should read the manual page to get to grips with all of its features.

You also need a good understanding of what you are looking for when reading the output.

To view the manual page run the following command:

man hexdump