Goodhart's Law and Machine Learning: A Structural Perspective
32 Pages · Posted: 23 Jul 2020 · Last revised: 3 May 2021
Date Written: April 15, 2021
Abstract
We develop a structural framework illustrating how penalized regression algorithms affect Goodhart bias when training data are clean but covariates are manipulated at cost by future agents facing prediction models. With quadratic manipulation costs, bias is proportional to the sum of squared slopes, micro-founding Ridge regression. Lasso is micro-founded in the limit under increasingly steep cost functions. However, standard penalization is inappropriate if costs depend upon percentage rather than absolute manipulation. Nevertheless, with known costs of either form, the following algorithm is proven manipulation-proof: within the training data, evaluate candidate coefficient vectors at their respective incentive-compatible manipulation configurations. Moreover, we obtain analytical expressions for the resulting coefficient adjustments: slopes (the intercept) shift downward if costs depend upon percentage (absolute) manipulation. Statisticians who ignore agent-borne manipulation costs select socially suboptimal penalization, resulting in socially excessive, and ultimately futile, manipulation. Model averaging, especially over Lasso or ensemble estimators, reduces manipulation costs significantly. Standard cross-validation fails to detect Goodhart bias.
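To make the manipulation-proof criterion concrete, the Python sketch below evaluates each candidate coefficient vector at the covariate manipulation it would induce, assuming a quadratic absolute-manipulation cost. The simulated data, the cost parameter c, and the optimizer are hypothetical illustrations of the idea stated in the abstract, not the paper's estimator.

import numpy as np
from scipy.optimize import minimize

# Illustrative sketch (not the paper's estimator): evaluate each candidate
# coefficient vector at the incentive-compatible manipulation it induces,
# assuming a quadratic absolute-manipulation cost c * ||delta||^2.
rng = np.random.default_rng(0)
n, p, c = 500, 3, 2.0                      # c is a hypothetical, known cost parameter
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 0.5, -0.3])
y = 0.2 + X @ beta_true + rng.normal(scale=0.5, size=n)

def manipulation_proof_loss(params):
    b0, b = params[0], params[1:]
    # Agents maximize b'delta - c * ||delta||^2, so delta* = b / (2c) and the
    # induced score inflation equals ||b||^2 / (2c).
    X_manipulated = X + b / (2 * c)
    residuals = y - b0 - X_manipulated @ b
    return np.mean(residuals ** 2)

fit = minimize(manipulation_proof_loss, x0=np.zeros(p + 1))
print("intercept and slopes:", np.round(fit.x, 3))

Under this assumed absolute-cost specification, the fitted intercept absorbs the predictable inflation of size ||b||^2 / (2c), consistent with the abstract's statement that the intercept shifts downward when costs depend upon absolute manipulation.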
Keywords: Machine Learning, Goodhart's Law, Structural Models