A Machine Learning Approach to Improving Occupational Income Scores

32 Pages. Posted: 3 Apr 2017. Last revised: 3 Feb 2018

Martin Hugo Saavedra

Oberlin College - Department of Economics

Tate Twinam

University of Washington, Bothell

Date Written: February 3, 2018

Abstract

Historical studies of labor markets frequently suffer from a lack of data on individual income. The occupational income score (OCCSCORE) is often used as an alternative measure of labor market outcomes. Using modern Census data, we find that the use of OCCSCORE biases results towards zero and can frequently result in statistically significant coefficients of the wrong sign. We use a machine learning approach to construct a new adjusted score based on industry, occupation, and individual demographics. Our alternative score substantially outperforms OCCSCORE in both modern and historical contexts. We illustrate our approach by estimating racial and gender earnings gaps in the 1915 Iowa State Census and intergenerational mobility elasticities using linked data from the 1850-1910 Censuses.
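The attenuation described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method: it replaces the lasso with simple cell means, and every number in it (the occupation premium, the within-occupation gap, the occupation shares) is a hypothetical assumption chosen only for illustration. It compares a group earnings gap measured three ways: with true log earnings, with an occupation-only score, and with a score that also conditions on the demographic group.

```python
import random
from statistics import mean


def simulate(n=10000, seed=0):
    """Simulate (group, occupation, log earnings) rows; all parameters are hypothetical."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = 1 if rng.random() < 0.5 else 0   # binary demographic indicator
        p_high = 0.3 if group else 0.5           # group is less often in the high-paying occupation
        occ = 1 if rng.random() < p_high else 0
        # Occupation premium of 0.5 and a within-occupation gap of -0.3 (assumed values).
        log_y = 2.0 + 0.5 * occ - 0.3 * group + rng.gauss(0, 0.2)
        rows.append((group, occ, log_y))
    return rows


def gap(rows, value):
    """Difference in the mean of value(row) between group 1 and group 0."""
    return (mean(value(r) for r in rows if r[0] == 1)
            - mean(value(r) for r in rows if r[0] == 0))


rows = simulate()

# Occupation-only score: everyone in an occupation gets that occupation's mean.
occ_score = {o: mean(r[2] for r in rows if r[1] == o) for o in (0, 1)}

# Adjusted score: mean within each (group, occupation) cell, a stand-in for
# the demographic adjustment that the paper estimates with a lasso.
cell_score = {(g, o): mean(r[2] for r in rows if r[0] == g and r[1] == o)
              for g in (0, 1) for o in (0, 1)}

true_gap = gap(rows, lambda r: r[2])
occ_gap = gap(rows, lambda r: occ_score[r[1]])
adj_gap = gap(rows, lambda r: cell_score[(r[0], r[1])])

print(f"true gap: {true_gap:.3f}")
print(f"occupation-only score gap: {occ_gap:.3f}")
print(f"adjusted score gap: {adj_gap:.3f}")
```

Under these assumptions the occupation-only score captures only the part of the gap that operates through occupational sorting, so its estimate is pulled towards zero, while the cell-level score recovers the full gap by construction. Real applications would have many more cells than observations can support, which is one reason a regularized estimator such as the lasso is a natural choice.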

Keywords: OCCSCORE, occupational income score, LIDO score, machine learning, lasso, non-classical measurement error, occupation, earnings gaps

JEL Classification: C21, C43, J71, N32

Suggested Citation

Saavedra, Martin Hugo and Twinam, Tate, A Machine Learning Approach to Improving Occupational Income Scores (February 3, 2018). Available at SSRN: https://ssrn.com/abstract=2944870 or http://dx.doi.org/10.2139/ssrn.2944870

Martin Hugo Saavedra

Oberlin College - Department of Economics

Oberlin, OH 44074
United States

HOME PAGE: http://sites.google.com/view/martinsaavedra/

Tate Twinam (Contact Author)

University of Washington, Bothell

18115 Campus Way NE
Bothell, WA 98011
United States
