How to Run Regression In Excel

Discover interesting correlations in data with regression analysis

Regression in Excel is a way to automate the statistical process of comparing several sets of information to see how changes in independent variables affect changes in dependent variables. If you've ever wanted to find a correlation between two things, using regression analysis in Excel is one of the best ways to do that.

Instructions in this article apply to Excel 2019, Excel 2016, Excel 2013, Excel 2010.

What's the Meaning of Regression?

Regression is a statistical modeling approach that analysts use to determine relationships between multiple variables.

Regression analysis starts with a single variable you're trying to analyze and independent variables you're testing to see if they affect that single variable. The analysis looks at changes in the independent variables and attempts to correlate those changes with resulting changes in the single (dependent) variable.

This may sound like advanced statistics, but Excel makes this complex analysis available to anyone.

Performing Linear Regression in Excel

The simplest form of regression analysis is linear regression. Simple linear regression looks at the relationship between only two variables.

For example, the following spreadsheet shows data containing the number of calories a person ate each day and their weight on that day.

Screenshot of a weight and calorie spreadsheet

Since this spreadsheet contains two columns of data, and one variable could potentially have an affect on the other, you can run a regression analysis on this data using Excel.

Enabling Analysis ToolPak Add-On

Before you can use Excel's regression analysis feature, you need to enable the Analysis ToolPak add-on in the Excel Options screen.

  1. In Excel, select the File menu and choose Options.

    Screenshot of Options in Excel
  2. Select Add-ins in the left navigation menu. Then, make sure Excel Add-ins is selected in the Manage field.

    Excel Options
  3. Finally, select the Go button.

    Excel Add-in options and Go button
  4. In the Add-ins pop-up window. Enable Analysis ToolPack by clicking the box in front of it to add a check mark and select OK.

    Excel Add-Ons pop-up window

Now that Analysis ToolPak is enabled, you're ready to start doing regression analysis in Excel.

How to Perform Simple Linear Regression in Excel

Using the weight and calories spreadsheet as an example, you can perform a linear regression analysis in Excel as follows.

  1. Select the Data menu. Then, in the Analysis group, select Data Analysis.

    Selecting Data Analysis in Excel
  2. In the Data Analysis window, select Regression from the list and click OK.

    Selecting Regression data analysis in Excel
  3. The Input Y Range is the range of cells that contains the dependent variable. In this example, that's the weight. The Input X Range is the range of cells that contains the independent variable. In this example, that's the calorie column.

    Fields for entering regression ranges in Excel
  4. Select Labels for the header cells, and then select New Worksheet to send the results to a new worksheet. Select OK to have Excel run the analysis and send the results into a new sheet.

    Excel fields for applying labels and sending regression to a new worksheet
  5. Examine the new worksheet. The analysis output has a number of values that you need to understand to interpret the results.

    Screenshot of regression analysis output in Excel

    Each of these numbers has the following meanings:

    • Multiple R: The Correlation Coefficient. 1 indicates a strong correlation between the two variables, while -1 means there's a strong negative relationship. 0 means there's no correlation.
    • R Square: The Coefficient of Determination, which shows how many points between the two variables fall on the regression line. Statistically, this is the sum of the squared deviations from the mean.
    • Adjusted R Square: A statistical value called R square that's adjusted for the number of independent variables you've chosen.
    • Standard Error: How precise the regression analysis results are. If this error is small then your regression results are more accurate.
    • Observations: The number of observations in your regression model.

    The remaining values in the regression output give you details about smaller components in the regression analysis.

    • df: Statistical value known as degrees of freedom related to the sources of variance.
    • SS: Sum of squares. The ratio of the residual sum of squares versus the total SS should be smaller if most of your data fits the regression line.
    • MS: Mean square of the regression data.
    • F: The F statistic (F-test) for null hypothesis. This provides the significance of the regression model.
    • Significance F: Statistical value known as P-value of F.

    Unless you understand statistics and calculating regression models, the values at the bottom of the summary won't have a lot of meaning. However the Multiple R and R Square are the two most important.

As you can see, in this example, calories have a strong correlation to total weight.

Multiple Linear Regression Analysis in Excel

To perform the same linear regression but with multiple independent variables, select the entire range (multiple columns and rows) for the Input X Range.

Screenshot of selecting a range for Input X Range

When selecting multiple independent variables, it's less likely you'll find as strong a correlation because there are so many variables.

However a regression analysis in Excel can help you find correlations with one or more of those variables that you may not realize exists just by reviewing the data manually.