Regression and Machine Learning Methods to Predict Discrete Outcomes in Accounting Research
71 Pages. Posted: 11 Mar 2021
Date Written: March 2021
Predictive modeling focuses on iteratively trying various combinations and transformations of a set of variables to generate a decision rule that predicts outcomes for new observations. Although accounting researchers have demonstrated a keen interest in predictive modeling, we identify a lack of accessible and applied guidance on this topic for accounting settings. This issue has become more salient with the increasing availability of machine learning models that use unfamiliar terminology, that can be estimated using several "competing" algorithms, and that produce outputs that differ from those of models used for causal inference. To address this gap, we provide an overview of how to predict discrete outcomes with logistic regression and two machine learning models used in recent studies: support vector machines and gradient boosting. We also include guidance and a comprehensive example (predicting investigations by the U.S. Securities and Exchange Commission) that illustrates the elements of the prediction process, highlighting the importance of "out-of-sample" accuracy and the unique aspects of presenting a prediction model's results.
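To make the idea of "out-of-sample" accuracy concrete, the following is a minimal sketch rather than the paper's own example. It assumes synthetic data with two made-up predictors and a binary outcome, fits a logistic regression by plain gradient descent on a training sample, and then scores the fitted decision rule on held-out observations. All names (`make_data`, `fit_logit`, `accuracy`), the coefficients used to simulate the data, and the 70/30 split are illustrative assumptions. Support vector machines and gradient boosting are not shown.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_data(n, rng):
    # Hypothetical data: two standard-normal predictors and a binary
    # outcome drawn from an assumed logistic model (not from the paper).
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = sigmoid(-0.5 + 2.0 * x1 - 1.0 * x2)
        y = 1 if rng.random() < p else 0
        rows.append(((x1, x2), y))
    return rows


def fit_logit(rows, lr=0.5, epochs=300):
    # Full-batch gradient descent on the average logistic log-loss.
    w, b, n = [0.0, 0.0], 0.0, len(rows)
    for _ in range(epochs):
        gw0 = gw1 = gb = 0.0
        for x, y in rows:
            err = sigmoid(b + w[0] * x[0] + w[1] * x[1]) - y
            gw0 += err * x[0]
            gw1 += err * x[1]
            gb += err
        w = [w[0] - lr * gw0 / n, w[1] - lr * gw1 / n]
        b -= lr * gb / n
    return w, b


def accuracy(rows, w, b, threshold=0.5):
    # Share of observations whose predicted class matches the outcome.
    hits = 0
    for x, y in rows:
        predicted = 1 if sigmoid(b + w[0] * x[0] + w[1] * x[1]) >= threshold else 0
        hits += int(predicted == y)
    return hits / len(rows)


rng = random.Random(0)           # fixed seed so the run is repeatable
data = make_data(1000, rng)
train, test = data[:700], data[700:]   # assumed 70/30 holdout split

w, b = fit_logit(train)          # the model only ever sees the training rows
in_sample = accuracy(train, w, b)
out_of_sample = accuracy(test, w, b)

print(f"in-sample accuracy:     {in_sample:.3f}")
print(f"out-of-sample accuracy: {out_of_sample:.3f}")
```

The design point is that the held-out rows play no role in estimation, so the second figure approximates how the decision rule would perform on new observations, whereas the first can overstate performance when a model is flexible enough to fit noise.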