What is Data Mining?

Big companies know more about you than you could ever imagine – here’s how

Data mining is the analysis of large amounts of data to discover patterns and knowledge, which is why it is also known as data discovery or knowledge discovery.

Data mining uses statistics, principles of machine learning (ML), artificial intelligence (AI), and vast amounts of data (often from databases or data sets) to identify patterns in a way that’s as automated and useful as possible.

What Does Data Mining Do?

Data mining has two primary objectives: description and prediction.

First, data mining describes the insights and knowledge obtained from analyzing patterns in data. Second, data mining uses the descriptions of recognized data patterns to predict future patterns.

For example, if you have spent time browsing on a shopping website for books about how to identify different types of plants, the data mining services working behind the scenes on that website log a description of your searches in connection with your profile. When you log in again two weeks later, the website’s data mining services use the descriptions of your previous searches to predict your current interests and offer personalized shopping recommendations that include books about identifying plants.
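The two steps in that example can be sketched in a few lines of Python. This is only a minimal illustration, not how any real shopping site works: the search log, the catalog, the topic labels, and the function names (describe, predict_interests, recommend) are all made up for this sketch, and the "prediction" is simply the assumption that the topic you searched for most often is what you still want.

```python
from collections import Counter

def describe(search_log):
    """Description step: summarize past searches as a count per topic."""
    return Counter(topic for topic, _query in search_log)

def predict_interests(profile, top_n=1):
    """Prediction step: assume the most frequent past topics are still of interest."""
    return [topic for topic, _count in profile.most_common(top_n)]

def recommend(catalog, interests):
    """Pick catalog items whose topic matches a predicted interest."""
    return [title for title, topic in catalog if topic in interests]

# Hypothetical search log tied to one profile: (topic, search query).
search_log = [
    ("plants", "how to identify wildflowers"),
    ("plants", "tree identification guide"),
    ("cooking", "easy weeknight dinners"),
    ("plants", "field guide to ferns"),
]

# Hypothetical catalog: (title, topic).
catalog = [
    ("Plant Identification Guide A", "plants"),
    ("Weeknight Cookbook B", "cooking"),
    ("Plant Identification Guide C", "plants"),
]

profile = describe(search_log)            # description of past behavior
interests = predict_interests(profile)    # prediction of current interests
recommendations = recommend(catalog, interests)
print(recommendations)
```

Real services use far richer descriptions and statistical models, but the shape is the same: summarize what was observed, then use that summary to guess what comes next.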

How Data Mining Works

Data mining relies on algorithms (sets of instructions that tell a computer or process how to do a task) to discover different types of patterns within data. A few of the pattern recognition methods used in data mining include cluster analysis, anomaly detection (also called outlier detection), association learning, dependency modeling, decision trees, regression models, classification, and neural networks.
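To make one of these methods concrete, here is a rough sketch of association learning: counting which items tend to be bought together and turning the counts into simple "if A, then B" rules. The baskets, the pair_rules function, and the 0.5 support threshold are assumptions chosen for illustration; production systems use more scalable algorithms and consider larger item sets than pairs.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.5):
    """Return (if_item, then_item, support, confidence) for frequent item pairs.

    support    = share of all baskets that contain both items
    confidence = share of baskets with if_item that also contain then_item
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates, fix the pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support >= min_support:
            rules.append((a, b, support, count / item_counts[a]))
            rules.append((b, a, support, count / item_counts[b]))
    return rules

# Hypothetical purchase baskets.
baskets = [
    ["field guide", "hand lens", "notebook"],
    ["field guide", "hand lens"],
    ["field guide", "notebook"],
    ["cookbook", "notebook"],
]

rules = pair_rules(baskets)
for if_item, then_item, support, confidence in rules:
    print(f"{if_item} -> {then_item}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data, every basket containing a hand lens also contains a field guide, so the rule "hand lens -> field guide" has a confidence of 1.0. A store could use a rule like that to suggest a field guide to anyone buying a hand lens.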

While data mining can be used to describe and predict patterns in all kinds of data, the use most people encounter, even if they don’t realize it, is describing patterns in their purchasing choices and behaviors to predict their likely future purchases.

As an example, have you ever wondered how Facebook always seems to know what you’ve been looking at online and shows you ads in your newsfeed related to other sites you’ve visited or your web searches?

Facebook’s data mining combines information stored in your browser that tracks your activities, such as cookies, with what it already knows about your patterns from your previous use of Facebook to predict products or offerings that may interest you.

What Kind of Data Can Be Mined?

Depending on the service or store (physical stores also use data mining), a surprising amount of data about you and your patterns can be mined. Data collected about you may include what type of vehicle you drive, where you live, places you’ve traveled, magazines and newspapers you subscribe to, and whether or not you are married. It may also reveal whether or not you have children, what your hobbies are, which bands you like, your political leanings, what you buy online, what you buy in physical stores (often tracked through customer loyalty reward cards), and any details you share about your life on social media.

For instance, retailers and fashion-based publications targeted at teenagers use insights from data mining photos on social media services like Instagram and Facebook to predict fashion trends that will lure in teen shoppers or readers. The insights discovered through data mining can be so precise that some retailers can even predict if a woman might be pregnant, based on very specific changes in her buying choices.

The retailer Target reportedly predicted pregnancy so accurately from patterns in buying history that it mailed coupons for baby products to a young woman, giving away her pregnancy before she had told her family.

Data mining is everywhere. However, much of the information discovered and analyzed about our buying habits, personal preferences, choices, finances, and online activities is used by stores and services with the intention of enhancing the customer experience.