Why Do Models that Predict Failure Fail?

65 Pages. Posted: 24 Jun 2020

Date Written: June 2, 2020


In the first part of this paper, we use millions of loan-level servicing records for mortgages originated between 2004 and 2016 to study the performance of predictive models of mortgage default. We find that the logistic regression model, the traditional workhorse for consumer credit modeling, as well as machine learning methods, can be very inaccurate when used to predict loan performance in out-of-time samples. Importantly, we find that this model failure was not unique to the early-2000s housing boom.

In the second part of the paper, we use the Panel Study of Income Dynamics to provide evidence that this model failure can be attributed to intertemporal heterogeneity in the relationship between the variables frequently used to predict mortgage performance and the realized post-origination path of variables that have been shown to trigger mortgage default. Our findings imply that model instability is a significant source of risk for lenders, such as financial technology firms ("Fintechs"), that rely heavily on predictive statistical models and machine learning algorithms for underwriting and account management.
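To make the idea of out-of-time failure concrete, the following is a minimal sketch on synthetic data and is not the authors' code, data, or specification. It assumes a single hypothetical standardized risk score, a hand-rolled logistic regression fit by gradient ascent, and an arbitrary shift in the intercept and slope between the estimation period and the later period. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, intercept, slope, rng):
    """Draw n (score, default) pairs with P(default) = sigmoid(intercept + slope * score)."""
    rows = []
    for _ in range(n):
        score = rng.gauss(0.0, 1.0)  # hypothetical standardized risk score
        default = 1 if rng.random() < sigmoid(intercept + slope * score) else 0
        rows.append((score, default))
    return rows


def fit_logit(rows, steps=300, lr=1.0):
    """Fit an intercept and slope by batch gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(rows)
    for _ in range(steps):
        g0 = g1 = 0.0
        for score, default in rows:
            resid = default - sigmoid(b0 + b1 * score)
            g0 += resid
            g1 += resid * score
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def log_loss(rows, b0, b1):
    """Mean negative log-likelihood of the observed outcomes under the fitted model."""
    total = 0.0
    for score, default in rows:
        p = min(max(sigmoid(b0 + b1 * score), 1e-12), 1.0 - 1e-12)
        total -= default * math.log(p) + (1 - default) * math.log(1.0 - p)
    return total / len(rows)


rng = random.Random(0)

# Estimation period and a same-period holdout share one data-generating process.
train = simulate(2000, -2.0, 1.0, rng)
holdout = simulate(2000, -2.0, 1.0, rng)

# Assumed later period: higher baseline default risk, weaker link to the score.
out_of_time = simulate(2000, -1.0, 0.3, rng)

b0, b1 = fit_logit(train)
holdout_loss = log_loss(holdout, b0, b1)
oot_loss = log_loss(out_of_time, b0, b1)

# Benchmark: a model re-estimated on the later period itself.
refit_loss = log_loss(out_of_time, *fit_logit(out_of_time))

print(f"same-period holdout log loss: {holdout_loss:.3f}")
print(f"out-of-time log loss:         {oot_loss:.3f}")
print(f"refit on later period:        {refit_loss:.3f}")
```

In this toy setup the model looks adequate on the same-period holdout but loses accuracy once the relationship between the score and default changes. The gap between the out-of-time loss and the refit benchmark is one rough way to gauge how much of the deterioration comes from parameter instability, as opposed to the later period simply being harder to predict.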

Keywords: Mortgages, Predictive Modeling, Machine Learning, Fintech, Lending, Lucas Critique

JEL Classification: G

Suggested Citation

Kiefer, Hua and Mayock, Tom, Why Do Models that Predict Failure Fail? (June 2, 2020). Available at SSRN: https://ssrn.com/abstract=3616889 or http://dx.doi.org/10.2139/ssrn.3616889

Hua Kiefer

FDIC

550 17th Street NW
Washington, DC 20429
United States

Tom Mayock (Contact Author)

UNC Charlotte

Charlotte, NC 28223
United States
