To Explain or To Predict?

Statistical Science

31 Pages. Posted: 2 Mar 2009. Last revised: 6 Sep 2010.

Galit Shmueli

Institute of Service Science, National Tsing Hua University, Taiwan

Date Written: May 24, 2010

Abstract

Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction, and description. Many disciplines use statistical modeling almost exclusively for causal explanation and assume that models with high explanatory power inherently have high predictive power. Conflation of explanation and prediction is common, yet the distinction must be understood to advance scientific knowledge. While this distinction has been recognized in the philosophy of science, the statistical literature lacks a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. The purpose of this paper is to clarify the distinction between explanatory and predictive modeling, to discuss its sources, and to reveal the practical implications of the distinction for each step in the modeling process.

Keywords: Explanatory models, causality, predictive modeling, predictive power, statistical strategy, data mining, scientific research

Suggested Citation

Shmueli, Galit, To Explain or To Predict? (May 24, 2010). Statistical Science, Available at SSRN: https://ssrn.com/abstract=1351252 or http://dx.doi.org/10.2139/ssrn.1351252

Galit Shmueli (Contact Author)

Institute of Service Science, National Tsing Hua University, Taiwan

Hsinchu, 30013
Taiwan

HOME PAGE: http://www.iss.nthu.edu.tw