What You Need to Know About Bayesian Spam Filtering

Find out how statistics help keep your inbox clean

Thomas Bayes (1702–1761), namesake of Bayes' theorem. Public domain.

Bayesian spam filters calculate the probability of a message being spam based on its contents. Unlike simple content-based filters, Bayesian spam filtering learns from both spam and good mail. The result is a robust, adaptive, and efficient anti-spam approach that, best of all, returns hardly any false positives.

How Do You Recognize Junk Email?

Think about how you detect spam. A quick glance is often enough.

You know what spam looks like, and you know what good mail looks like.

The probability of spam looking like good mail is around… zero.

Scoring Content-Based Filters Do Not Adapt

Would it not be great if automatic spam filters worked like that, too?

Scoring content-based spam filters try just that. They look for words and other characteristics typical of spam. Every characteristic element is assigned a score, and a spam score for the whole message is computed from the individual scores. Some scoring filters also look for characteristics of legitimate mail, lowering a message's final score.
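
The scoring idea described above can be sketched in a few lines of Python. This is only an illustration: the word lists, the weights, and the threshold are made-up values, not taken from any real filter.

```python
# Hypothetical, hand-picked weights; a real scoring filter would ship a much longer list.
SPAM_WEIGHTS = {"toner": 2.5, "free": 1.0, "winner": 2.0}
HAM_WEIGHTS = {"cartesian": -2.0, "meeting": -1.0}  # legitimate-mail words lower the score
THRESHOLD = 3.0  # assumed cutoff above which a message is treated as spam


def spam_score(message: str) -> float:
    """Add up the fixed scores of every known word in the message."""
    weights = {**SPAM_WEIGHTS, **HAM_WEIGHTS}
    return sum(weights.get(word, 0.0) for word in message.lower().split())


def is_spam(message: str) -> bool:
    return spam_score(message) >= THRESHOLD
```

Note that nothing in this sketch changes unless somebody edits the two dictionaries by hand.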

The scoring-filter approach does work, but it has several drawbacks.

  • The list of characteristics is built from the spam (and the good mail) available to the filter's engineers. To get a good grasp of the typical spam anybody might get, mail must be collected at hundreds of email addresses. This weakens the filters' effectiveness, especially because the characteristics of good mail differ from person to person, and that difference is not taken into account.
  • The characteristics to look for are more or less set in stone. If the spammers make the effort to adapt (and make their spam look like good mail to the filters), the filtering characteristics have to be tweaked manually — an even bigger effort.
  • The score assigned to each word is probably based on a good estimate, but it is still arbitrary. And like the list of characteristics, it adapts neither to the changing world of spam in general nor to an individual user's needs.

    Bayesian Spam Filters Tweak Themselves, Getting Better and Better

    Bayesian spam filters are a kind of scoring content-based filter, too. Their approach does away with the problems of simple scoring spam filters, though, and it does so radically. Since the weakness of scoring filters lies in the manually built list of characteristics and their scores, this list is eliminated.

    Instead, Bayesian spam filters build the list themselves. Ideally, you start with a (big) bunch of emails that you have classified as spam, and another bunch of good mail. The filters look at both, and analyze the legitimate mail as well as the spam to calculate the probability of various characteristics appearing in spam, and in good mail.
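
As a rough sketch of that training step, the following Python counts how many spam and good messages each word appears in and turns the counts into a per-word spam probability. The function names, the whitespace tokenizer, the equal weighting of spam and good mail, and the 0.01 to 0.99 clamp are all assumptions made for this illustration.

```python
from collections import Counter


def tokenize(text: str) -> set:
    """Naive tokenizer (an assumption): lowercase, split on whitespace, one hit per message."""
    return set(text.lower().split())


def train(spam_messages: list, ham_messages: list) -> dict:
    """Return {token: estimated probability that a message containing it is spam}.

    Both lists must be non-empty.
    """
    spam_counts, ham_counts = Counter(), Counter()
    for message in spam_messages:
        spam_counts.update(tokenize(message))
    for message in ham_messages:
        ham_counts.update(tokenize(message))

    probabilities = {}
    for token in spam_counts.keys() | ham_counts.keys():
        spam_rate = spam_counts[token] / len(spam_messages)  # share of spam containing the token
        ham_rate = ham_counts[token] / len(ham_messages)     # share of good mail containing it
        p = spam_rate / (spam_rate + ham_rate)
        probabilities[token] = min(0.99, max(0.01, p))        # keep away from exactly 0 and 1
    return probabilities
```

With only a handful of training messages the estimates are crude; the point is that the list and its scores come from the mail itself rather than from an engineer.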

    How a Bayesian Spam Filter Examines an Email

    The characteristics a Bayesian spam filter can look at include

    • the words in the body of the message, of course,
    • its headers (senders and message paths, for example),
    • other aspects such as HTML/CSS code (like colors and other formatting),
    • word pairs and phrases, and
    • meta information (where a particular phrase appears, for example).
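
A minimal sketch of how such characteristics might be turned into features follows. The `extract_features` function and its `body:`, `pair:`, and header-name prefixes are hypothetical conventions chosen for this example; the prefix records where a token appeared, which is a simple form of the meta information mentioned above.

```python
import re


def extract_features(headers: dict, body: str) -> set:
    """Turn one message into a set of feature strings (naming scheme is an assumption)."""
    features = set()

    # Single words from the body.
    words = re.findall(r"[a-z0-9']+", body.lower())
    features.update(f"body:{word}" for word in words)

    # Adjacent word pairs from the body.
    features.update(f"pair:{first} {second}" for first, second in zip(words, words[1:]))

    # Header tokens, tagged with the header they came from.
    for name, value in headers.items():
        for token in re.findall(r"[a-z0-9'.@-]+", value.lower()):
            features.add(f"{name.lower()}:{token}")

    return features
```

For example, `extract_features({"From": "sales@example.com"}, "Cheap toner now")` yields features such as `body:toner`, `pair:cheap toner`, and `from:sales@example.com`.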

    If a word, "Cartesian" for example, never appears in spam but often in the legitimate email you receive, the probability that "Cartesian" indicates spam is near zero. "Toner", on the other hand, appears exclusively, and often, in spam. "Toner" has a very high probability of being found in spam, not much below 1 (100%).
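
Written out with Bayes' theorem, the probability that a message containing a word w is spam looks like this. The symbols S (spam) and H (good mail, or "ham") are notation introduced here for illustration.

```latex
P(S \mid w) = \frac{P(w \mid S)\,P(S)}{P(w \mid S)\,P(S) + P(w \mid H)\,P(H)}
```

If "Cartesian" never appears in spam, then P(w | S) = 0 and the whole expression is 0. If "toner" never appears in good mail, then P(w | H) = 0 and the expression is 1. Filters typically avoid these extreme values in practice, since a word that has not yet been seen in spam may still turn up there later.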

    When a new message arrives, it is analyzed by the Bayesian spam filter, and the probability of the complete message being spam is calculated using the individual characteristics.

    Assume a message contains both "Cartesian" and "toner". From these words alone it's not yet clear whether we have spam or legit mail. Other characteristics will (hopefully and most probably) indicate a probability that allows the filter to classify the message as either spam or good mail.
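
One common way to combine the individual probabilities is the naive Bayes rule, which treats the characteristics as independent. The sketch below assumes that rule, and the probability values in the usage note are hypothetical.

```python
import math


def combine(probabilities: list) -> float:
    """Combine per-token spam probabilities (each strictly between 0 and 1).

    Computes prod(p) / (prod(p) + prod(1 - p)) in log space to avoid underflow
    on long messages.
    """
    log_spam = sum(math.log(p) for p in probabilities)
    log_ham = sum(math.log(1.0 - p) for p in probabilities)
    logit = log_spam - log_ham
    # Numerically stable logistic function.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)
```

With "Cartesian" at 0.01 and "toner" at 0.99, `combine([0.01, 0.99])` comes out at about 0.5, so the two words cancel each other. Adding two further spam-leaning characteristics, as in `combine([0.01, 0.99, 0.95, 0.9])`, pushes the result above 0.99.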

    Bayesian Spam Filters Can Learn Automatically

    Now that we have a classification, the message can be used to train the filter itself further. In this case, either the probability of "Cartesian" indicating good mail is lowered (if the message containing both "Cartesian" and "toner" is found to be spam), or the probability of "toner" indicating spam must be reconsidered.

    Using this auto-adaptive technique, Bayesian filters can learn from both their own decisions and the user's (when the user manually corrects a misjudgment by the filter). The adaptability of Bayesian filtering also ensures that the filters are most effective for the individual email user.
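
The feedback loop can be sketched as a small class that keeps running counts and updates them every time a message is classified or corrected. The `AdaptiveFilter` name, its methods, and the add-one smoothing are assumptions made for this illustration.

```python
from collections import Counter


class AdaptiveFilter:
    """Keeps per-token counts and refines them with every classified message."""

    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def learn(self, message: str, is_spam: bool) -> None:
        """Feed one classified message (from the filter or from a user correction) back in."""
        tokens = set(message.lower().split())
        if is_spam:
            self.spam_counts.update(tokens)
            self.n_spam += 1
        else:
            self.ham_counts.update(tokens)
            self.n_ham += 1

    def token_probability(self, token: str) -> float:
        """Estimated probability that the token indicates spam (add-one smoothing assumed)."""
        spam_rate = (self.spam_counts[token] + 1) / (self.n_spam + 2)
        ham_rate = (self.ham_counts[token] + 1) / (self.n_ham + 2)
        return spam_rate / (spam_rate + ham_rate)
```

If a filter that has seen "cartesian" only in good mail then learns a spam message containing it, `token_probability("cartesian")` moves upward, which is the adjustment described above.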

    While most people's spam may have similar characteristics, the legitimate mail is characteristically different for everybody.

    How Can Spammers Get Past Bayesian Filters?

    The characteristics of legitimate mail are just as important for the Bayesian spam filtering process as the spam is. If the filters are trained specifically for every user, spammers will have an even harder time working around everybody's (or even most people's) spam filters, and the filters can adapt to almost everything spammers try.

    Spammers will only make it past well-trained Bayesian filters if they

    • make their spam messages look perfectly like the ordinary email everybody may get.

    Spammers do not usually send such ordinary emails. Let us assume this is because these emails do not work as junk email. So chances are they will not send them even when ordinary, boring emails are the only way to make it past spam filters.

    If spammers do switch to mostly ordinary-looking emails, however, we will see a lot of spam in our inboxes again, and email may become as frustrating as it was in pre-Bayesian days (or even worse). Such a switch would also ruin the market for most kinds of spam, though, and thus would not last for long.

    Strong Indicators Can be a Bayesian Spam Filter's Achilles' Heel

    There is one conceivable exception that lets spammers work their way through Bayesian filters even with their usual content. It is in the nature of Bayesian statistics that

    • one word or characteristic that very frequently appears in good mail can be so significant as to turn any message from looking like spam to being rated as ham by the filter.

    If spammers find a way to determine your sure-fire good-mail words (by using HTML return receipts to see which messages you opened, for example), they can include one of them in a junk mail and reach you even through a well-trained Bayesian filter.
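
A small numeric sketch shows how strong this effect can be. It assumes the naive Bayes combination rule, and all probability values are hypothetical. Three typical spam words on their own rate the message as almost certainly spam; adding a single word the filter trusts almost completely drags the rating well below one half.

```python
import math


def spam_probability(token_probs: list) -> float:
    """Naive Bayes combination: prod(p) / (prod(p) + prod(1 - p))."""
    spam = math.prod(token_probs)
    ham = math.prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


usual_spam_words = [0.9, 0.8, 0.85]  # hypothetical probabilities for typical spam words
trusted_word = 0.001                 # hypothetical sure-fire good-mail word

before = spam_probability(usual_spam_words)                  # rated as spam
after = spam_probability(usual_spam_words + [trusted_word])  # now rated as good mail
```

Capping per-token probabilities, for instance to the range 0.01 to 0.99, is one way to limit how much a single word can sway the result.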

    John Graham-Cumming has tried this by letting two Bayesian filters work against each other, the "bad" one adapting to which messages are found to get through the "good" filter. He says it works, though the process is time-consuming and complex. I don't think we will see much of this happening, at least not on a large scale, and not tailored to individuals' email characteristics. Spammers may (try to) figure out some key words for organizations (something like "Almaden" for some people at IBM maybe?) instead.

    In general, though, spam will be (significantly) different from regular mail, or it will not be spam.

    The Bottom Line: Bayesian Filtering's Strength Can Be its Weakness

    Bayesian spam filters are content-based filters that

    • are specifically trained to recognize the individual email user's spam and good mail, making them highly effective and difficult for spammers to adapt to.
    • can continually and without much effort or manual analysis adapt to the spammers' latest tricks.
    • take the individual user's good mail into account and have a very low rate of false positives.

    Unfortunately, if this causes blind trust in Bayesian anti-spam filters, it renders the occasional mistake even more serious. The opposite kind of error, false negatives (spam that looks exactly like regular mail), has the potential to disturb and frustrate users.