A Bayesian Analysis of Linear Regression Models with Highly Collinear Regressors
30 Pages Posted: 11 Oct 2018
Date Written: October 6, 2018
Abstract
Exact collinearity between regressors leaves their individual coefficients unidentified. Given an informative prior, however, their Bayesian posterior means are well defined. Just as exact collinearity causes non-identification of the parameters, high collinearity can be viewed as weak identification of the parameters, represented, in line with the weak-instrument literature, by a correlation matrix that is of full rank for a finite sample size T but converges to a rank-deficient matrix as T goes to infinity. The asymptotic behaviour of the posterior mean and precision of the parameters of a linear regression model is examined in the cases of exactly and highly collinear regressors. In both cases the posterior mean remains sensitive to the choice of prior means even when the sample size is large, and the posterior precision rises at a slower rate than the sample size. In the highly collinear case, the posterior means converge to normally distributed random variables whose mean and variance depend on the prior means and prior precisions. The distribution degenerates to fixed points under either exact collinearity or strong identification. The analysis also suggests a diagnostic statistic for the highly collinear case. Monte Carlo simulations and an empirical example are used to illustrate the main findings.
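The point that an informative prior yields well-defined posterior means under exact collinearity, while the individual coefficients remain prior-dependent even at large T, can be illustrated with a minimal sketch. This is not from the paper: it assumes a conjugate normal prior with known noise variance, and the regressors, coefficients, and prior settings below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 10_000                       # large sample size
x1 = rng.normal(size=T)
x2 = 2.0 * x1                    # exactly collinear regressor: X'X is rank deficient
X = np.column_stack([x1, x2])
y = 1.0 * x1 + 0.5 * x2 + rng.normal(scale=0.5, size=T)

sigma2 = 0.25                    # noise variance assumed known for simplicity

def posterior_mean(m0, P0):
    """Posterior mean of beta under the prior N(m0, P0^{-1}).

    The prior precision P0 makes the posterior precision
    P0 + X'X / sigma2 invertible even though X'X is singular.
    """
    P_post = P0 + X.T @ X / sigma2
    return np.linalg.solve(P_post, P0 @ m0 + X.T @ y / sigma2)

# Two different informative priors with the same (identity) precision.
m_a = posterior_mean(np.zeros(2), np.eye(2))
m_b = posterior_mean(np.array([5.0, -5.0]), np.eye(2))

# The identified combination beta1 + 2*beta2 is pinned down by the data,
# but the individual posterior means stay sensitive to the prior means.
print("prior A:", m_a, "identified combo:", m_a[0] + 2 * m_a[1])
print("prior B:", m_b, "identified combo:", m_b[0] + 2 * m_b[1])
```

Only the linear combination that the data can distinguish converges; the component of the posterior mean along the null direction of X'X is determined by the prior, which is the sensitivity the abstract describes.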
A previous version of this paper can be found at: http://ssrn.com/abstract=3076052
Keywords: Bayesian Identification, Multicollinear Regressions, Weakly Identified Regression Coefficients, Highly Collinear Regressors.
JEL Classification: C11, C18