Email Headers Can Tell You About the Origin of Spam

Find out where junk mail is coming from

Spam will end when it is no longer profitable. Spammers will see their profits tumble if nobody buys from them (because you don't even see the junk emails). This is the easiest way to fight spam, and certainly one of the best.

Complaining About Spam

You can affect the expenses side of a spammer's balance sheet, too. If you complain to the spammer's Internet Service Provider (ISP), the spammer may lose their connection and might have to pay a fine (depending on the ISP's acceptable use policy).

Since spammers know and fear such reports, they try to hide. That's why finding the right ISP is not always easy. Fortunately, there are tools like SpamCop that simplify reporting spam to the correct address.

Determining the Source of Spam

How does SpamCop find the right ISP to complain to? It takes a close look at the spam message's header lines. These headers contain information about the path an email took.

SpamCop follows the path back to the point from which the spammer sent the email. From this point, identified by an IP address, it can derive the spammer's ISP and send the report to this ISP's abuse department.

Let's take a closer look at how this works.

Email Header and Body

Every email message consists of two parts, the body and the header. The header is like the email envelope containing the sender's address, the recipient, the subject, and other information. The body has the actual text and the attachments.

Some header information usually displayed by your email program includes:

  • From: The sender's name and email address.
  • To: The recipient's name and email address.
  • Date: The date when the message was sent.
  • Subject: The subject line.
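
As a rough illustration of the split between header and body, here is a minimal sketch using Python's standard email module. The message and its addresses are made up for this example and are not taken from any real mail.

```python
from email import message_from_string

# Hypothetical message, invented purely for illustration.
raw = (
    "From: Alice Example <alice@example.com>\n"
    "To: Bob Example <bob@example.org>\n"
    "Date: Sun, 16 Nov 2003 13:38:22 +0000\n"
    "Subject: Hello\n"
    "\n"  # the first blank line separates the header from the body
    "This is the body.\n"
)

msg = message_from_string(raw)

# The header fields an email program typically displays.
displayed = {name: msg[name] for name in ("From", "To", "Date", "Subject")}

# Everything after the blank line is the body.
body = msg.get_payload()

for name, value in displayed.items():
    print(f"{name}: {value}")
print("Body:", body.strip())
```

Nothing in this parsing step checks whether the From value is true; it is simply whatever text the sender put there.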

Header Forging

The actual delivery of emails does not depend on any of these headers. They are just convenient.

Usually, the From line, for example, will be set to the sender's address so you know who the message is from and can reply quickly.

Spammers want to make sure you cannot reply easily, and certainly don't want you to know who they are. That's why they insert fictitious email addresses in the From lines of their junk messages.

Received Lines

So the From line is useless if we want to determine the real source of an email. Fortunately, we need not rely on it. The headers of every email message also contain Received lines.

Email programs do not usually display these, but they can be very useful in tracing spam.

Parsing Received Header Lines

Just like a postal letter will go through several post offices on its way from sender to recipient, an email message is processed and forwarded by several mail servers.

Imagine every post office putting a unique stamp on each letter. The stamp would say exactly when the mail was received, where it came from, and where it was forwarded to by the post office. If you got the letter, you could determine the exact path taken by the letter.

This is precisely what happens with email.

Received Lines for Tracing

As a mail server processes a message, it adds a particular line, the Received line, to the message's header. Most interestingly, this line contains the server name and IP address of the machine the server received the message from, as well as the name of the mail server itself.

Each new Received line is always added at the top of the message headers. To reconstruct an email's journey, we start at the topmost Received line (why we do this will become apparent in a moment) and work our way down until we arrive at the last one, which is where the email originated.
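
The walk from top to bottom can be sketched with Python's standard email module. The two-hop message below is invented for illustration (the host names and addresses are placeholders from documentation ranges), and the layout of its Received lines is only one of many that real servers produce.

```python
from email import message_from_string

# Hypothetical message with two hops; newest Received line is on top.
raw = (
    "Received: from relay.example.net (relay.example.net [198.51.100.7])\n"
    "\tby mx.example.org with ESMTP; Mon, 1 Jan 2024 10:00:05 +0000\n"
    "Received: from sender.example.com (sender.example.com [203.0.113.9])\n"
    "\tby relay.example.net with ESMTP; Mon, 1 Jan 2024 10:00:02 +0000\n"
    "From: someone@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

msg = message_from_string(raw)

# get_all() returns the lines in the order they appear, top to bottom.
# Long header lines may be folded across lines, so collapse the whitespace.
hops = [" ".join(value.split()) for value in msg.get_all("Received", [])]

for position, hop in enumerate(hops, start=1):
    print(f"{position} (from the top): {hop}")
```

Here hops[0] is the stamp added by the recipient's own server, and hops[-1] is the oldest stamp, the one closest to the claimed origin.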

Received Line Forging

Spammers know that we will apply this procedure to uncover their whereabouts. They might insert forged Received lines that point to somebody else sending the message to fool us.

Since every mail server will always put its Received line at the top, the spammers' forged headers can only be at the bottom of the Received line chain. This is why we start our analysis at the top and don't just derive the point where an email originated from the first Received line (at the bottom).

How to Tell a Forged Received Header Line

The forged Received lines inserted by spammers to fool us will look like all the other Received lines (unless they make an obvious mistake, of course). Taken by itself, a forged Received line cannot be told from a genuine one, which is where one distinct feature of Received lines comes into play. As we've noted above, every server will not only note who it is but also where it got the message from (in IP address form).

We compare what a server claims to be with what the server one notch up in the chain says it is. If the two don't match, the earlier line (the one lower down) is a forged Received line.

In this case, the email's origin is what the server immediately after the forged Received says.
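
The comparison can be sketched in a few lines of Python. This is a simplified model, not how SpamCop or any particular tool works: the Hop class and find_origin function are hypothetical names, and the sketch assumes each server's claimed identity can be compared directly with the IP address recorded one line up. Real headers usually name servers by host name, so an actual check would also need DNS lookups.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    by: str       # who the stamping server claims to be
    from_ip: str  # IP address it recorded for the machine it received from


def find_origin(hops):
    """Return the last trustworthy source IP.

    hops must be ordered top (newest, our own server) to bottom (oldest).
    """
    if not hops:
        raise ValueError("no Received lines to inspect")
    origin = hops[0].from_ip  # our own server's record is trusted
    for upper, lower in zip(hops, hops[1:]):
        if lower.by != upper.from_ip:
            # The lower line claims an identity that the line above
            # did not record, so treat it (and all below) as forged.
            break
        origin = lower.from_ip
    return origin


# Consistent chain: every claim matches the record one line up.
honest = [Hop("mx.example.org", "198.51.100.7"),
          Hop("198.51.100.7", "203.0.113.9")]

# Inconsistent chain: the lower line claims to be 192.0.2.50,
# but the line above recorded 198.51.100.7.
forged = [Hop("mx.example.org", "198.51.100.7"),
          Hop("192.0.2.50", "203.0.113.9")]

print(find_origin(honest))
print(find_origin(forged))
```

In the consistent chain the origin is the address at the bottom; in the inconsistent one the walk stops at the address recorded by the server just above the forged line.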

Are you ready for an example?

Example Spam Analyzed and Traced

Now that we know the theoretical underpinning, let's analyze a junk email to identify its origin in real life.

We've just received an exemplary piece of spam that we can use for exercise. Here are the header lines:

Received: from unknown (HELO 38.118.132.100) (62.105.106.207) by mail1.infinology.com with SMTP; 16 Nov 2003 19:50:37 -0000
Received: from [235.16.47.37] by 38.118.132.100 id ; Sun, 16 Nov 2003 13:38:22 -0600
Message-ID:
From: "Reinaldo Gilliam"
Reply-To: "Reinaldo Gilliam"
To: ladedu@ladedu.com
Subject: Category A Get the meds u need lgvkalfnqnh bbk
Date: Sun, 16 Nov 2003 13:38:22 GMT
X-Mailer: Internet Mail Service (5.5.2650.21)
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="9B_9.._C_2EA.0DD_23"
X-Priority: 3
X-MSMail-Priority: Normal

Can you tell the IP address where the email originated?

Sender and Subject

First, look at the forged From line. The spammer wants to make it look like the message came from a Yahoo! Mail account. With the Reply-To line, this From address aims to direct all bouncing messages and angry replies to a non-existing Yahoo! Mail account.

Next, the Subject is a curious accumulation of random characters. It is barely legible and designed to fool spam filters (every message gets a slightly different set of random characters). Still, it is also quite skillfully crafted to get the message across despite this.

The Received Lines

Finally, the Received lines. Let's begin with the oldest, Received: from [235.16.47.37] by 38.118.132.100 id ; Sun, 16 Nov 2003 13:38:22 -0600. There are no hostnames in it, but two IP addresses: 38.118.132.100 claims to have received the message from 235.16.47.37. If this is correct, 235.16.47.37 is where the email originated, and we'd find out which ISP this IP address belongs to, then send an abuse report to them.

Let's see if the next (and in this case last) server in the chain confirms the first Received line's claims: Received: from unknown (HELO 38.118.132.100) (62.105.106.207) by mail1.infinology.com with SMTP; 16 Nov 2003 19:50:37 -0000.

Since mail1.infinology.com is the last server in the chain and indeed "our" server, we know that we can trust it. It has received the message from an "unknown" host claiming to have the IP address 38.118.132.100 (using the SMTP HELO command). So far, this is in line with what the previous Received line said.

Now let's see where our mail server did get the message from. To find out, we take a look at the IP address in brackets immediately before by mail1.infinology.com. This is the IP address the connection was established from, and it is not 38.118.132.100. No, 62.105.106.207 is where this piece of junk mail was sent from.
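
Pulling those two addresses out of the trusted Received line can be sketched with a regular expression. The pattern below is an assumption that fits only the layout of this particular line ("(HELO ...)" followed by the connecting address in parentheses); other mail servers format their Received lines differently and would need different patterns.

```python
import re

# The topmost Received line from the example, written by our own server.
line = (
    "from unknown (HELO 38.118.132.100) (62.105.106.207) "
    "by mail1.infinology.com with SMTP; 16 Nov 2003 19:50:37 -0000"
)

# Assumed layout: (HELO <claimed name>) (<connecting IP>) by <server>
pattern = re.compile(
    r"\(HELO (?P<helo>[^)]+)\)\s+"
    r"\((?P<ip>\d{1,3}(?:\.\d{1,3}){3})\)\s+"
    r"by\s+(?P<by>\S+)"
)

match = pattern.search(line)
helo = match["helo"]  # what the sending machine claimed to be
ip = match["ip"]      # where the connection actually came from
by = match["by"]      # the server that wrote this line

print("Claimed in HELO:", helo)
print("Connected from: ", ip)
print("Recorded by:    ", by)
print("Claim matches connection:", helo == ip)
```

The HELO value is supplied by the sending machine and can say anything, while the address in parentheses is what the receiving server observed for the connection, which is why the two can differ.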

With this information, you can now identify the spammer's ISP and report the unsolicited email to them so that they can kick the spammer off the net.