Can Machine Learning Models Capture Correlations in Corporate Distresses?
32 Pages. Posted: 18 Nov 2018. Last revised: 13 Jul 2019
Date Written: July 12, 2019
A number of papers document that recent machine learning models outperform traditional corporate distress models at ranking firms by their riskiness. However, it remains an open question whether advanced machine learning models capture correlations in distresses well enough to be used for joint modelling, a task that traditional distress models often struggle with. We implement a machine learning model that is regularly among the top performers and find that the accuracy of individual distress probabilities improves, while the predicted aggregate distress rate is almost unchanged relative to traditional distress models. Our findings thus suggest that complex machine learning models do not eliminate the excess clustering in distresses. As an alternative, we propose a frailty model, which allows for correlations in distresses, augmented with regression splines. This model is competitive at ranking firms by their riskiness while also providing accurate aggregate risk measures.
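To illustrate what "excess clustering" means here, the following is a minimal simulation sketch and not the model estimated in the paper. It assumes a logistic distress probability with one firm-specific covariate and a period-level frailty term shared by all firms; the function name `simulate_rates` and every parameter value (intercept, slope, frailty standard deviation, sample sizes) are hypothetical choices made for illustration only. The sketch compares how much the aggregate distress rate varies across periods with and without the shared frailty.

```python
import math
import random
import statistics


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_rates(n_periods, n_firms, frailty_sd, rng,
                   intercept=-3.0, slope=1.0):
    """Return the realised distress rate in each simulated period.

    All parameter values are illustrative assumptions, not estimates
    from the paper.
    """
    rates = []
    for _ in range(n_periods):
        # Frailty: one unobserved shock shared by every firm in the period.
        frailty = rng.gauss(0.0, frailty_sd)
        distressed = 0
        for _ in range(n_firms):
            x = rng.gauss(0.0, 1.0)  # firm-specific covariate
            p = sigmoid(intercept + slope * x + frailty)
            if rng.random() < p:
                distressed += 1
        rates.append(distressed / n_firms)
    return rates


rng = random.Random(0)
# Distresses correlated through the shared frailty term.
with_frailty = simulate_rates(200, 500, 0.5, rng)
# Distresses independent given the observed covariate (no frailty).
without_frailty = simulate_rates(200, 500, 0.0, rng)

sd_with = statistics.pstdev(with_frailty)
sd_without = statistics.pstdev(without_frailty)
print(f"std of aggregate distress rate, with frailty:    {sd_with:.4f}")
print(f"std of aggregate distress rate, without frailty: {sd_without:.4f}")
```

In this toy setting, a model that treats distresses as independent given the covariate would understate how widely the aggregate distress rate can move from period to period, which is the kind of shortfall the abstract refers to when it says that excess clustering is not eliminated.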
Keywords: corporate default prediction, discrete hazard models, frailty models, gradient boosting
JEL Classification: C55, G17, G33