Explaining Machine Learning by Bootstrapping Partial Dependence Functions and Shapley Values
66 Pages. Posted: 18 Nov 2021. Last revised: 22 Nov 2021.
Date Written: October 28, 2021
Machine learning and artificial intelligence methods are often referred to as "black boxes" when compared to traditional regression-based approaches. However, both traditional and machine learning methods are concerned with modeling the joint distribution between endogenous (target) and exogenous (input) variables. The fitted models are themselves functionals of the data, about which we can do statistical inference using computationally intensive methods such as the bootstrap. Where linear models describe the fitted relationship between the target and input variables via the slope of that relationship (the coefficient estimates), the same fitted relationship can be described rigorously for any machine learning model by first-differencing the partial dependence functions. Bootstrapping these first-differenced functionals provides standard errors and confidence intervals for the estimated relationships. We show that this approach replicates the point estimates of the coefficients attained by linear OLS models, and we demonstrate how it generalizes to marginal relationships in ML/AI models. This paper extends the partial dependence plot described in Friedman (2001), and visualizes the marginal distribution used to construct the PDP as described in Goldstein et al. (2015) before applying the steps described above. We further discuss the extension of the PDP into a Shapley value decomposition and explore how it can be used to further explain model outputs. We conclude with a hedonic house pricing example, which illustrates how machine learning methods such as random forests, deep neural networks, and support vector regression automatically capture nonlinearities, and which sheds light on inconsistencies revealed by meta-studies of hedonic house pricing.
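The core idea in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's code: the data-generating process, grid, and model choices below are assumptions for demonstration. It computes a partial dependence function by fixing one feature at each grid value and averaging predictions, first-differences it to obtain a slope (which, for a linear model, coincides with the OLS coefficient), and bootstraps that functional to obtain a standard error and confidence interval.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

def pdp_slope(model, X, j, grid):
    """Average first difference of the partial dependence function for feature j."""
    pdp = []
    for g in grid:
        Xg = X.copy()
        Xg[:, j] = g                          # fix feature j at the grid value
        pdp.append(model.predict(Xg).mean())  # average over the other features
    return np.mean(np.diff(pdp) / np.diff(grid))

grid = np.linspace(-2.0, 2.0, 21)

# For a linear model, the first-differenced PDP reproduces the fitted
# OLS coefficient (up to floating-point error).
ols = LinearRegression().fit(X, y)
slope = pdp_slope(ols, X, 0, grid)

# Bootstrap the functional to obtain a standard error and a 95% interval.
boot = []
for _ in range(200):
    idx = rng.integers(0, n, n)
    m = LinearRegression().fit(X[idx], y[idx])
    boot.append(pdp_slope(m, X[idx], 0, grid))
se = np.std(boot, ddof=1)
ci = (np.mean(boot) - 1.96 * se, np.mean(boot) + 1.96 * se)
```

Replacing `LinearRegression` with any other fitted estimator (a random forest, a neural network, a support vector regression) leaves the procedure unchanged, which is what makes the first-differenced PDP a model-agnostic analogue of a coefficient.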
JEL Classification: C14, C18, C15