Assumptions Behind the Linear Regression Model

10 Pages. Posted: 5 Apr 2010

Phillip E. Pfeifer

University of Virginia - Darden School of Business

Abstract

In a previous note, “Introduction to Least-Squares Modeling” (UVA-QA-0500), we saw how least squares can be used to fit the simple linear model to historical data. The resulting model can then be used to forecast the next occurrence of Y, the dependent variable, for a given value of X, the independent variable. This use of least squares to fit a forecasting model requires no assumptions. It can be applied to almost any situation, and a reasonable forecast results. At this level of analysis, least-squares modeling amounts to fitting a straight line through a cloud of points and then using the fitted line to interpolate or extrapolate a new value of Y for a given X.

Excerpt

UVA-QA-0271

ASSUMPTIONS BEHIND THE LINEAR REGRESSION MODEL

In a previous note, “Introduction to Least-Squares Modeling” (UVA-QA-0500), we saw how least squares can be used to fit the simple linear model to historical data. The resulting model can then be used to forecast the next occurrence of Y, the dependent variable, for a given value of X, the independent variable. This use of least squares to fit a forecasting model requires no assumptions. It can be applied to almost any situation, and a reasonable forecast results. At this level of analysis, least-squares modeling amounts to fitting a straight line through a cloud of points and then using the fitted line to interpolate or extrapolate a new value of Y for a given X.
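As an illustration of this assumption-free use of least squares, the following minimal Python sketch fits a straight line to a handful of points and reads a forecast off the fitted line. The data values and the function names `fit_line` and `forecast` are made up for illustration and do not come from the note.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def forecast(intercept, slope, x_new):
    """Point forecast of Y for a given X, read off the fitted line."""
    return intercept + slope * x_new

# Hypothetical historical data (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
y_hat = forecast(intercept, slope, 6.0)  # extrapolates one step beyond the data
print(f"Y = {intercept:.2f} + {slope:.2f} X; forecast at X = 6: {y_hat:.2f}")
```

Nothing in this calculation depends on how the data were generated, which is why it yields a point forecast but no statement about how far the new Y may fall from it.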

Although we need not make any assumptions to use this procedure, we leave an important question unanswered: How close can we expect the new Y to be to our forecast? Without some additional assumptions, we have no way of making a probability statement about the new Y. In many practical business situations, such a probability statement is an essential element in the decision-making process.

There is a procedure for measuring the uncertainty associated with a least-squares forecast that will produce a complete probability distribution for a new Y. This procedure brings real value and legitimacy to the regression-modeling and forecasting process, changing it from a simple process (one step above graph paper and a ruler) to one that intelligently combines managerial judgment and statistical theory to produce believable point and interval forecasts.
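The excerpt does not spell out this procedure. As a hedged sketch, the code below uses the standard textbook prediction interval for simple linear regression, which is an assumption here rather than a quotation of the note's method. The data are made up, the function name `prediction_interval` is hypothetical, and the caller supplies the Student-t critical value from a table (3.182 for a 95% interval with n - 2 = 3 degrees of freedom).

```python
from math import sqrt
from statistics import mean

def prediction_interval(xs, ys, x_new, t_crit):
    """Return (low, point, high) for a new Y at x_new.

    Assumes the usual regression conditions hold and that t_crit is the
    Student-t critical value for n - 2 degrees of freedom.
    """
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))

    # Standard error of the forecast for a single new observation.
    se_forecast = s * sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)

    point = intercept + slope * x_new
    return point - t_crit * se_forecast, point, point + t_crit * se_forecast

# Hypothetical historical data (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

low, point, high = prediction_interval(xs, ys, 6.0, t_crit=3.182)
print(f"forecast {point:.2f}, 95% interval ({low:.2f}, {high:.2f})")
```

The interval widens as the new X moves away from the mean of the historical X values, so extrapolated forecasts carry more uncertainty than interpolated ones.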

That's the good news. The inevitable bad news is that to make probability statements about a new Y using a least-squares regression model, a variety of assumptions must be made. In other words, probability statements made using linear regression theory are true only if certain assumptions hold. You can thus see the importance of (1) understanding these assumptions, (2) knowing how to check their validity, (3) understanding the consequences of an incorrect assumption, and (4) knowing what can be done if the assumptions do not hold. This note addresses each of these four points for the four general assumptions behind linear regression. The model must be checked for (1) linearity, (2) independence, (3) homoskedasticity, and (4) normality.
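Checks of this kind usually start from the residuals of the fitted line. The sketch below is one possible illustration, not the note's own procedure: it computes the residuals, the Durbin-Watson statistic (a common check on independence, where values near 2 suggest little first-order autocorrelation), and a rough ratio of residual spread between the upper and lower halves of the X range (a crude look at homoskedasticity). The data and the helper names are hypothetical.

```python
from statistics import mean

def residuals(xs, ys):
    """Residuals (actual minus fitted) from the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def durbin_watson(res):
    """Durbin-Watson statistic; always lies between 0 and 4."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    return num / sum(e ** 2 for e in res)

def spread_ratio(xs, res):
    """Mean squared residual in the upper half of X over the lower half."""
    ordered = [e for _, e in sorted(zip(xs, res))]
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[-half:]
    return mean(e ** 2 for e in upper) / mean(e ** 2 for e in lower)

# Hypothetical historical data, assumed to be in time order (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

res = residuals(xs, ys)
dw = durbin_watson(res)
ratio = spread_ratio(xs, res)
print(f"Durbin-Watson: {dw:.2f}, spread ratio: {ratio:.2f}")
```

With only five points these numbers say very little. In practice, plots of residuals against X and against the fitted values are also used to look for curvature (linearity), and a histogram or normal probability plot of the residuals to judge normality.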

. . .

Keywords: management science

Suggested Citation

Pfeifer, Phillip E., Assumptions Behind the Linear Regression Model. Darden Case No. UVA-QA-0271, Available at SSRN: https://ssrn.com/abstract=1584516

Phillip E. Pfeifer (Contact Author)

University of Virginia - Darden School of Business

P.O. Box 6550
Charlottesville, VA 22906-6550
United States
434-924-4803 (Phone)

HOME PAGE: http://www.darden.virginia.edu/faculty/Pfeifer.htm
