A Machine Learning Approach to Improving Occupational Income Scores
32 Pages
Posted: 3 Apr 2017
Last revised: 3 Feb 2018
Date Written: February 3, 2018
Historical studies of labor markets frequently suffer from a lack of data on individual income. The occupational income score (OCCSCORE) is often used as an alternative measure of labor market outcomes. Using modern Census data, we find that using OCCSCORE biases estimates toward zero and can frequently produce statistically significant coefficients of the wrong sign. We use a machine learning approach to construct a new adjusted score based on industry, occupation, and individual demographics. Our alternative score substantially outperforms OCCSCORE in both modern and historical contexts. We illustrate our approach by estimating racial and gender earnings gaps in the 1915 Iowa State Census and intergenerational mobility elasticities using linked data from the 1850–1910 Censuses.
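The bias toward zero described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's lasso procedure: it uses entirely simulated data, hypothetical parameter values (`OCC_EFFECT`, `WITHIN_GAP`, the sorting weights), and plain cell means as a stand-in for a fitted adjusted score. It only shows why a score that varies by occupation alone may understate a group earnings gap, while a score that also varies with a demographic characteristic can recover it.

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical values, chosen only for illustration (not from the paper).
OCC_EFFECT = [2.0, 2.3, 2.6, 2.9, 3.2]  # mean log income by occupation
WITHIN_GAP = -0.30                      # within-occupation penalty for group 1


def simulate(n=20000):
    """Return simulated (occupation, group, log_income) rows."""
    rows = []
    for _ in range(n):
        g = random.randint(0, 1)
        # Group 1 is sorted more often into low-paying occupations.
        weights = [5, 4, 3, 2, 1] if g == 1 else [1, 2, 3, 4, 5]
        occ = random.choices(range(5), weights=weights)[0]
        y = OCC_EFFECT[occ] + WITHIN_GAP * g + random.gauss(0, 0.5)
        rows.append((occ, g, y))
    return rows


def cell_means(rows, key):
    """Mean log income within each cell defined by key(row)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row in rows:
        k = key(row)
        sums[k] += row[2]
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


def group_gap(rows, values):
    """Difference in mean of `values` between group 1 and group 0."""
    means = {}
    for g in (0, 1):
        vals = [v for (_, gg, _), v in zip(rows, values) if gg == g]
        means[g] = sum(vals) / len(vals)
    return means[1] - means[0]


rows = simulate()

# Occupation-only score: one value per occupation (OCCSCORE-like).
occ_score = cell_means(rows, key=lambda r: r[0])
# Adjusted score: one value per (occupation, group) cell. This is a
# simplified stand-in for a score fitted on occupation and demographics.
adj_score = cell_means(rows, key=lambda r: (r[0], r[1]))

gap_true = group_gap(rows, [r[2] for r in rows])
gap_occ = group_gap(rows, [occ_score[r[0]] for r in rows])
gap_adj = group_gap(rows, [adj_score[(r[0], r[1])] for r in rows])

print(f"gap using actual log income:      {gap_true:.3f}")
print(f"gap using occupation-only score:  {gap_occ:.3f}")
print(f"gap using adjusted score:         {gap_adj:.3f}")
```

In this setup the occupation-only score captures only the part of the gap that comes from sorting across occupations, so the estimated gap is smaller in magnitude than the gap in actual log income. The cell-mean adjusted score reproduces the actual gap by construction, which is a property of this toy design and not a claim about the performance of the paper's score.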
Keywords: OCCSCORE, occupational income score, LIDO score, machine learning, lasso, non-classical measurement error, occupation, earnings gaps
JEL Classification: C21, C43, J71, N32