What Exactly Is 'Big Data'?

And why is it a big deal?

Big data is the new science of understanding and predicting human behavior by studying large volumes of unstructured data. It is closely tied to, and often loosely called, 'predictive analytics'.

Some examples of big data work include analyzing Twitter posts, Facebook feeds, eBay searches, GPS trackers, and ATMs. Others include studying security videos, traffic data, weather patterns, flight arrivals, cell phone tower logs, and heart rate trackers. Big data is a messy new science that changes weekly, and only a few experts understand it all.

Examples of Big Data in Regular Life

While most big data projects are very obscure, there are successful examples of big data affecting the everyday life of individuals, companies, and governments:

Predicting virus outbreaks: by studying socio-political data, weather and climate data, and hospital/clinical data, scientists are now predicting dengue fever outbreaks with four weeks' advance notice.

Homicide Watch: this big data project profiles murder victims, suspects, and criminals in Washington, DC. It serves both as a way to honor the deceased and as an awareness resource for the public.

Transit Travel Planning, NYC: WNYC radio programmer Steve Melendez combined the online subway schedule with travel itinerary software. His creation lets New Yorkers click their location on a map to see predicted travel times by train and subway.

Xerox reduced their workforce loss: call center work is emotionally exhausting. Xerox has studied reams of data with the help of professional analysts, and now they can predict which call center hires are likely to stay with the company the longest.

Supporting counter-terrorism: by studying social media, financial records, flight reservations, and security data, law enforcement can identify and locate terrorist suspects before they act.

Adjusting brand marketing based on social media reviews: people bluntly and quickly share their online thoughts on a pub, restaurant, or fitness club. It is possible to study these millions of social media posts and provide feedback to the company on what people think of their services.

Who Uses Big Data? What Do They Do With It?

Many large corporations use big data to adjust their offerings and prices to maximize customer satisfaction, and public agencies have put it to work as well.

  • Macy's department store: uses big data to adjust its prices on the fly for over 70 million products. They even send customized emails to their customers based on what Macy's believes they are interested in.
  • Police response to the Boston Marathon bombing: by using big data to study video and surveillance images, the police were able to quickly narrow down their search for the suspects.
  • Morton's Steakhouse: uses Twitter to pull off marketing stunts, including the famous New Jersey airport delivery of a porterhouse steak and shrimp dinner.
  • Visa: uses big data to identify and catch fraudsters. A single transaction here or there can easily conceal a dishonest credit card user, but by watching millions of transactions carefully, patterns of fraud can be detected.
  • Facebook: uses big data to tailor advertising. By carefully studying your Facebook likes and browsing habits, the social media giant has eerie insight into your tastes. Those sidebar ads you see on your Facebook feed are chosen by very deliberate and complex algorithms that have been watching your Facebook habits.
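To make the fraud example above a little more concrete, here is a minimal Python sketch of the general idea: one purchase means little on its own, but it stands out when compared against everything else the same card has done. This is not how Visa actually works. The card names, the amounts, the `flag_unusual` function, and the "10 times the typical amount" rule are all made up for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample data: (card, purchase amount in dollars).
transactions = [
    ("card_A", 12.50), ("card_A", 9.99), ("card_A", 14.20),
    ("card_A", 11.00), ("card_A", 950.00),
    ("card_B", 200.00), ("card_B", 180.00),
    ("card_B", 220.00), ("card_B", 210.00),
]

def flag_unusual(transactions, factor=10.0):
    """Flag purchases far above the card's own typical (median) amount.

    The factor of 10 is an arbitrary assumption for this sketch.
    """
    # Group every amount by the card that made it.
    by_card = defaultdict(list)
    for card, amount in transactions:
        by_card[card].append(amount)

    flagged = []
    for card, amount in transactions:
        typical = median(by_card[card])
        if amount > factor * typical:
            flagged.append((card, amount))
    return flagged

suspicious = flag_unusual(transactions)
print(suspicious)
```

A $200 purchase is normal for card_B but would be unusual for card_A, which is why the comparison is made per card rather than against one fixed dollar limit.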

Why Is Big Data Such a Big Deal?

1. The data is massive. It won't fit on a single hard drive, much less a USB stick. The volume of data far exceeds what the human mind can readily picture (think of petabytes and exabytes, meaning millions and billions of gigabytes).

2. The data is messy and unstructured. 50% to 80% of big data work is converting and cleaning the information so that it is searchable and sortable. Relatively few experts fully know how to do this data cleanup. These experts also need very specialized tools, such as Hadoop and analytics software from vendors like HPE, to do their craft. Perhaps in 10 years, big data experts will become a dime a dozen, but for now, they are a very rare species of analyst and their work is still very obscure and tedious.

3. Data has become a commodity that can be bought and sold. Data marketplaces exist where companies and individuals can buy terabytes of social media and other data. Most of the data is cloud-based, as it is too large to fit onto any single hard disk. Buying data commonly involves a subscription fee where you plug into a cloud server farm.

The leaders of big data tools and ideas are Amazon, Google, Facebook, and Yahoo. Because these companies serve so many millions of people with their online services, it makes sense that they would be the collection point and the visionaries behind big data analytics.

4. The possibilities of big data are endless. Perhaps doctors will one day predict heart attacks and strokes for individuals weeks before they happen. Airplane and automobile crashes might be reduced by predictive analyses of their mechanical data and traffic and weather patterns. Online dating might be improved by big data predictors of which personalities are compatible with yours. Musicians might get insight into what music composition is the most pleasing to the changing tastes of target audiences. Nutritionists might be able to predict which combination of store-bought foods will aggravate or help a person's medical conditions. The surface has only been scratched, and discoveries in big data happen every week.

Big Data Is Messy

Big data, as predictive analytics, depends on converting massive, unstructured data into something searchable and sortable. This is a messy and chaotic space that requires a special kind of knowledge and patience.

Take, for example, the giant UPS delivery service. The programmers at UPS study data from their drivers' GPS units and smartphones to analyze the most efficient ways to adapt to traffic congestion. This GPS and smartphone data is gargantuan, and not automatically ready for analysis. It pours in from various GPS and map databases, through different smartphone hardware devices. UPS analysts have spent months converting all that data into a format that can be easily searched and sorted. The effort has been worth it, though. UPS has saved over 8 million gallons of fuel since it started using these big data analytics.
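What "converting data into a format that can be easily searched and sorted" means is easier to see at small scale. The Python sketch below is not UPS's actual system. The driver IDs, field names, coordinates, and the `normalize` function are all hypothetical. It simply shows two imaginary data sources that describe the same kind of event in different ways, and one cleanup step that brings them into a single shape and discards records that cannot be used.

```python
from datetime import datetime, timezone

# Hypothetical raw records: each source uses its own field names and formats.
raw_records = [
    {"driver": "D-102", "lat": "40.7128", "lon": "-74.0060",
     "time": "2024-03-01T08:15:00+00:00"},
    {"DriverID": "d-087", "position": "40.7306,-73.9352", "ts": 1709280000},
    {"driver": " D-102 ", "lat": "40.7580", "lon": "-73.9855",
     "time": "2024-03-01T08:05:00+00:00"},
    {"DriverID": "d-087", "position": "not available", "ts": 1709280600},
]

def normalize(record):
    """Return a record with uniform keys, or None if it cannot be used."""
    try:
        if "position" in record:
            # Source B: "lat,lon" in one string, time as Unix seconds.
            lat_text, lon_text = record["position"].split(",")
            when = datetime.fromtimestamp(record["ts"], tz=timezone.utc)
            driver = record["DriverID"]
        else:
            # Source A: separate lat/lon strings, time as ISO 8601 text.
            lat_text, lon_text = record["lat"], record["lon"]
            when = datetime.fromisoformat(record["time"])
            driver = record["driver"]
        return {
            "driver": driver.strip().upper(),
            "lat": float(lat_text),
            "lon": float(lon_text),
            "time": when,
        }
    except (KeyError, ValueError):
        # Missing field or unparseable value: drop the record.
        return None

cleaned = [r for r in map(normalize, raw_records) if r is not None]

# Once every record has the same shape, sorting and searching are easy.
cleaned.sort(key=lambda r: (r["driver"], r["time"]))
for r in cleaned:
    print(r["driver"], r["time"].isoformat(), r["lat"], r["lon"])
```

Real projects face the same problem with far more sources and far messier values, which is why the cleanup stage consumes so much of the effort.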

Because big data is messy and requires so much effort to clean up and prepare for use, data scientists have been nicknamed 'data janitors' for all the tedious work they do.

The science of big data and predictive analytics is improving every week, though. Expect big data to become readily accessible to everyone by the year 2025.

Is Big Data an Intrusive Threat to Privacy?

Yes. If our laws and individual privacy defenses are not carefully managed, big data intrudes into personal privacy. As it stands, Google, YouTube, and Facebook already track your daily online habits. Your smartphone and computer use leave digital footprints every day, and sophisticated companies are studying those footprints.

The laws around big data are evolving. Privacy is a state of being that you must now take personal responsibility for, as you can no longer expect it as a default right.

What You Can Do to Protect Your Privacy

The biggest single step you can take is to cloak your daily habits using a VPN. A VPN service encrypts your connection and routes it through another server so that your identity and location are at least partially masked from trackers. This will not make you 100% anonymous, but a VPN will substantially reduce how much the world can observe of your online habits.
