Estimating Family Income From Administrative Banking Data
JPMorgan Chase, 2018
20 Pages
Posted: 12 Jan 2019
Date Written: December 17, 2018
At the JPMorgan Chase Institute, we aim to publish generalizable insights that are representative of the overall US population. To do this, we require a method to reweight research based on key characteristics, with income foremost among them. Because we do not have full coverage of income information across our portfolio of customers, we set out to develop a reliable method for estimating income. Using machine learning techniques, we trained a model to estimate gross family income against a truth set drawn from credit card and mortgage application data. JPMC Institute Income Estimate (JPMC IIE) version 1.0 uses gradient boosting machines (GBM) and relies heavily on administrative banking data such as checking account inflows. It predicts income with a mean absolute error (MAE) of 41 percent, outperforming comparative benchmarks, and demonstrates consistent accuracy across predicted income pentiles (average 55 percent). JPMC IIE version 1.0 is currently in use for research purposes, and reweighting with it produces results similar to reweighting with truth set income. Future versions will seek to improve predictive power and expand the use of the estimate.
JEL Classification: B41, C10, J30
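The abstract reports error as a percentage and describes using the estimate for reweighting. The following is a minimal sketch of both calculations, not the Institute's method: it assumes the percentage error is the mean of |predicted - actual| / actual, and that reweighting is simple post-stratification by income bin (weight = population share / sample share). The function names, bin labels, and toy figures are hypothetical.

```python
from collections import Counter


def mean_absolute_percent_error(actual, predicted):
    """Mean of |predicted - actual| / actual, expressed in percent.

    Assumed definition; the paper may normalize the error differently.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def poststratification_weights(sample_bins, population_shares):
    """Per-bin weight so the weighted sample matches population shares.

    weight(bin) = population share of bin / sample share of bin
    """
    counts = Counter(sample_bins)
    n = len(sample_bins)
    return {b: population_shares[b] / (counts[b] / n) for b in counts}


# Toy data (hypothetical): truth-set income vs. estimated income.
actual = [40_000, 60_000, 100_000, 200_000]
predicted = [50_000, 45_000, 100_000, 150_000]
mape = mean_absolute_percent_error(actual, predicted)  # 18.75

# Toy sample that over-represents the "low" bin relative to the population.
sample_bins = ["low", "low", "low", "high"]
population_shares = {"low": 0.5, "high": 0.5}
weights = poststratification_weights(sample_bins, population_shares)
```

In this toy case the over-represented "low" bin is weighted down to 2/3 and the "high" bin up to 2.0, so the weights sum to the sample size.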