Track Influenza With Google Flu Trends

It's not surprising that people search for information about the flu when they're sick. Google found a way to tap this trend and use it to estimate flu activity by region. The company discovered that search trend data was about two weeks faster than the traditional flu outbreak tracking methods used by the CDC (Centers for Disease Control and Prevention).

Google Flu Trends will give you an estimate of the current outbreak level in the USA or break it down state by state. You can also see trends from past years and search for a place to find flu shots near you.

Big Data

Google Flu Trends is an example of the discoveries that can be made with "big data," a term used to describe massive structured or unstructured data sets that would be too large and complex to be examined using traditional methods.

Traditional analysis of data usually involves keeping what you collect to a manageable size. Researchers use small statistical samples of very large groups in order to make informed guesses about the larger group. For example, political polling is done by calling a relatively small number of people and asking them questions. If the sample resembles the larger group (say, all voters in Massachusetts), then the survey results from the small group can be used to make guesses about the larger group. For this to work, you need a very clean data set, and you need to know what you're searching for.
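To make the polling idea concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the electorate of 100,000 voters, the 52 percent level of support, the sample size of 1,000, and the function name estimate_share are all assumptions, not real polling data or a real pollster's method.

```python
import random

def estimate_share(population, sample_size, seed=0):
    """Estimate the share of True values in a population from a random sample."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    sample = rng.sample(population, sample_size)  # sample without replacement
    return sum(sample) / sample_size

# Hypothetical electorate: 100,000 voters, exactly 52% favor a candidate.
voters = [True] * 52_000 + [False] * 48_000

true_share = sum(voters) / len(voters)
poll_share = estimate_share(voters, sample_size=1_000)

print(f"True share:    {true_share:.3f}")
print(f"Poll estimate: {poll_share:.3f}")
```

The poll only talks to 1 percent of the voters, yet its estimate should usually land within a few percentage points of the true share. That only holds because the sample is drawn at random from the whole group, which is the "resembles the larger group" condition described above.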

Big data, on the other hand, uses data sets as large as possible—say, all the search queries in Google. When you use a data set that large, you also get "messy" data: incomplete entries, search entries by cats walking across keyboards, and so on. It's fine. Big data analysis can take this into account and still end up drawing conclusions that otherwise may not have been found.

One of those discoveries was Google Flu Trends, which looks at spikes in search queries for flu symptoms. You don't always Google, "Hey, I have the flu. OK Google, where is a doctor near me?" You tend to search for things like "a headache and fever." The slight upward trend in an otherwise very messy and large set of search queries is the thing that powers Google Flu Trends. 
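As a rough illustration of spotting an upward trend in noisy counts, the Python sketch below flags any week whose count jumps well above the average of the few weeks before it. This is not Google's actual model. The weekly counts are invented, and the function name find_spikes, the four-week baseline, and the threshold of two standard deviations are all assumptions chosen to keep the example simple.

```python
from statistics import mean, stdev

def find_spikes(weekly_counts, baseline_weeks=4, threshold=2.0):
    """Return the indexes of weeks whose count exceeds the mean of the
    preceding baseline_weeks by more than threshold standard deviations."""
    spikes = []
    for i in range(baseline_weeks, len(weekly_counts)):
        window = weekly_counts[i - baseline_weeks:i]
        mu, sigma = mean(window), stdev(window)
        # Skip perfectly flat windows to avoid dividing by zero.
        if sigma > 0 and (weekly_counts[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes

# Invented weekly counts of symptom searches such as "headache and fever".
counts = [100, 104, 98, 101, 103, 99, 102, 140, 180, 230]

print(find_spikes(counts))
```

With these made-up numbers, the first seven weeks wobble around 100 and are ignored, while the last three weeks (indexes 7, 8, and 9) are flagged as the counts climb. A real system would also have to deal with seasonality, news-driven searches, and regional differences, none of which this sketch attempts.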

This is more than just a novelty since it spots flu spikes faster than the CDC. The CDC relies on positive flu tests from doctors and hospitals. That means that people have to get sick enough to visit a doctor in numbers sufficient to cause a spike in flu testing, and then the labs have to report the trend. People will already be sick by the time you're able to mobilize treatment.