Shapley Value Regression and the Resolution of Multicollinearity
30 Pages Posted: 22 Jun 2016 Last revised: 17 Apr 2017
Date Written: June 17, 2016
Multicollinearity in empirical data violates the assumption of independence among the explanatory variables in a linear regression model. By inflating the standard errors of the estimated regression coefficients, it leads to failure to reject a false null hypothesis that a regressor has no effect on the regressand (a type II error). Very frequently, it also distorts the signs of the regression coefficients. Shapley value regression is one of the best methods to combat this adversity to empirical analysis. To this end, the present paper makes two contributions: first, it simplifies the algorithm for computing the Shapley value (the decomposition of R^2 into fair shares attributed to the individual regressor variables); second, it provides a computer program that carries out the computation easily. The program also computes standardized as well as regular regression coefficients from the Shapley value. Obviously, these coefficients are not OLS-optimal. It must also be mentioned that Shapley value regression becomes increasingly impracticable as the number of regressor variables exceeds 10 or 12, although, in practice, a good regression model should not have more than ten regressors.
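The decomposition described above can be sketched as follows. This is a minimal Python illustration of the standard Shapley decomposition of R^2 (each regressor's share is its R^2 contribution averaged over all subsets of the other regressors, with the usual Shapley weights), not the author's simplified algorithm or Fortran program; the function names are hypothetical.

```python
# Hedged sketch of Shapley value decomposition of R^2 across regressors.
# Not the paper's Fortran program; a generic illustration using NumPy OLS.
import itertools
import math
import numpy as np

def r_squared(X, y, cols):
    """R^2 of an OLS fit of y on the regressor columns in `cols` (plus intercept)."""
    if not cols:
        return 0.0
    Xs = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    return 1.0 - (resid @ resid) / (((y - y.mean()) ** 2).sum())

def shapley_r2(X, y):
    """Shapley share of R^2 for each of the k regressors.

    share_j = sum over subsets S not containing j of
              |S|! (k-|S|-1)! / k! * (R^2(S ∪ {j}) - R^2(S)).
    Cost grows as 2^k, which is why the method becomes impracticable
    beyond roughly 10-12 regressors.
    """
    k = X.shape[1]
    shares = np.zeros(k)
    for j in range(k):
        others = [p for p in range(k) if p != j]
        for r in range(k):
            w = math.factorial(r) * math.factorial(k - r - 1) / math.factorial(k)
            for S in itertools.combinations(others, r):
                shares[j] += w * (r_squared(X, y, S + (j,)) - r_squared(X, y, S))
    return shares
```

By the efficiency property of the Shapley value, the shares sum exactly to the R^2 of the full model, which is what makes the decomposition a "fair" attribution even when the regressors are highly collinear.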
Keywords: Shapley value regression, multicollinearity, algorithm, computer program, Fortran
JEL Classification: C63, C71