Credit Risk Modeling in the Age of Machine Learning
62 pages. Posted: 18 Nov 2021. Last revised: 25 Apr 2022.
Date Written: April 24, 2022
Abstract
Based on the world’s largest loss database of corporate defaults, we perform a comparative analysis of machine learning (ML) methods in credit risk modeling across the globe. We find that ML methods, especially tree-based methods, substantially outperform both simple and sophisticated benchmarks. These results hold across different credit risk parameters, even though we use a uniform modeling framework for the ML methods. We find that the commonly applied out-of-sample validation—as opposed to out-of-time validation—results in inflated performance measures, consistent with an “information leakage channel” that induces information spillovers, particularly for macroeconomic features; this problem is prevalent in many economic contexts. Our results provide guidance for financial institutions, regulatory authorities, and academics.
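The distinction between out-of-sample and out-of-time validation can be illustrated with a minimal sketch. It is not the authors' code or data: the toy records, the field names (`loan_id`, `default_year`), the 30% test share, and the 2013 cutoff are all hypothetical. Under the assumption that macroeconomic features are shared by all loans observed in the same year, a random split places the same years in both training and test sets, which is one plausible route for the leakage the abstract describes. A time-based split keeps the test years strictly after the training years.

```python
import random

# Hypothetical toy data: one record per loan with its default year.
# Field names, sample size, and year range are illustrative assumptions.
loans = [{"loan_id": i, "default_year": 2000 + i % 20} for i in range(1000)]

def out_of_sample_split(records, test_share=0.3, seed=0):
    """Random split that ignores time (commonly called out-of-sample)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]

def out_of_time_split(records, cutoff_year):
    """Train on years up to and including the cutoff, test on later years."""
    train = [r for r in records if r["default_year"] <= cutoff_year]
    test = [r for r in records if r["default_year"] > cutoff_year]
    return train, test

def shared_years(train, test):
    """Years that appear in both the training set and the test set."""
    return {r["default_year"] for r in train} & {r["default_year"] for r in test}

oos_train, oos_test = out_of_sample_split(loans)
oot_train, oot_test = out_of_time_split(loans, cutoff_year=2013)

oos_overlap = shared_years(oos_train, oos_test)
oot_overlap = shared_years(oot_train, oot_test)

print("years shared under random split:", len(oos_overlap))
print("years shared under time split:  ", len(oot_overlap))
```

Any year-level feature, such as a macroeconomic indicator, takes the same value for training and test loans in a shared year, so performance measured on the random split may overstate how well a model forecasts periods it has not yet seen.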
Keywords: risk management, credit risk modeling, machine learning, forecasting, macroeconomic variables
JEL Classification: C18, C52, C53, C55, G17, G21