The Sense and Non-Sense of Holdout Sample Validation in the Presence of Endogeneity
Marketing Science, Vol. 30, No. 6, pp. 1115-1122, November-December 2011
42 Pages. Posted: 7 Jul 2010. Last revised: 16 Apr 2013.
Market response models that use field-generated data are required to address potential endogeneity in the regressors to obtain consistent parameter estimates. Another requirement is that market response models predict well in a holdout sample. Combining both requirements, it may seem reasonable to subject an endogeneity-corrected model to a holdout prediction task, and this is quite common in the academic marketing literature. One may be inclined to expect that the consistent parameter estimates obtained via instrumental variable (IV) estimation predict better than the biased ordinary least squares (OLS) estimates. This paper shows that this expectation is incorrect. That is, if the holdout sample is similar to the estimation sample so that the regressors are endogenous in both samples, holdout sample validation favors regression estimates that are not corrected for endogeneity (i.e., OLS) over estimates that are corrected for endogeneity (i.e., IV estimation). A key take-away is that if consistent parameter estimates are the primary model objective, the model should be validated with an exogenous (rather than endogenous) holdout sample. If prediction is the primary model objective, we recommend refraining from correcting for endogeneity with IV estimation.
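The abstract's central claim can be illustrated with a small simulation. The sketch below is not the paper's own analysis; it assumes a hypothetical data-generating process with one endogenous regressor x, one valid instrument z, a true slope of 1.0, and an arbitrary endogeneity strength of 0.8. All function names and parameter values are illustrative assumptions. It estimates the slope by OLS and by a simple IV (ratio-of-covariances) estimator, then compares holdout mean squared error in an endogenous and an exogenous holdout sample.

```python
import numpy as np


def simulate(n, rng, endogenous=True, beta=1.0):
    """Draw (z, x, y) from an assumed, illustrative data-generating process.

    z is an instrument, u is the structural error, and x depends on u
    only when endogenous=True. The coefficient 0.8 is an arbitrary choice.
    """
    z = rng.normal(size=n)
    u = rng.normal(size=n)
    v = rng.normal(size=n)
    x = z + v + (0.8 * u if endogenous else 0.0)
    y = beta * x + u
    return z, x, y


def ols_fit(x, y):
    """Return (intercept, slope) from simple OLS of y on x."""
    slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return y.mean() - slope * x.mean(), slope


def iv_fit(z, x, y):
    """Return (intercept, slope) from the single-instrument IV estimator."""
    slope = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
    return y.mean() - slope * x.mean(), slope


def holdout_mse(fit, x, y):
    """Mean squared prediction error of a fitted line on holdout data."""
    intercept, slope = fit
    return float(np.mean((y - intercept - slope * x) ** 2))


rng = np.random.default_rng(0)
n = 200_000

# Estimation sample: the regressor is endogenous.
z, x, y = simulate(n, rng, endogenous=True)
ols = ols_fit(x, y)
iv = iv_fit(z, x, y)

# Holdout sample similar to the estimation sample (x still endogenous).
_, x_en, y_en = simulate(n, rng, endogenous=True)
# Holdout sample in which x is set independently of the error (exogenous).
_, x_ex, y_ex = simulate(n, rng, endogenous=False)

results = {
    "ols_slope": ols[1],
    "iv_slope": iv[1],
    "mse_endog_ols": holdout_mse(ols, x_en, y_en),
    "mse_endog_iv": holdout_mse(iv, x_en, y_en),
    "mse_exog_ols": holdout_mse(ols, x_ex, y_ex),
    "mse_exog_iv": holdout_mse(iv, x_ex, y_ex),
}

for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, the OLS slope is biased upward while the IV slope is close to the true value, yet OLS attains the lower error in the endogenous holdout sample because its bias absorbs the correlation between x and the error. In the exogenous holdout sample the ranking reverses, which matches the recommendation to validate with an exogenous holdout sample when consistent estimates are the objective.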
Keywords: Instrumental Variables, Holdout Sample Validation, Endogeneity
JEL Classification: C50, C51, C52, C53