Identifying Urban Areas by Combining Human Judgment and Machine Learning: An Application to India

45 Pages. Posted: 25 Feb 2020. Last revised: 28 Feb 2020


Virgilio Galdo

Michigan State University - Economics

Yue Li

World Bank

Martin Rama

World Bank

Date Written: February 24, 2020

Abstract

This paper proposes a methodology for identifying urban areas that combines subjective assessments with machine learning, and applies it to India, a country where several studies regard the official urbanization rate as an underestimate. For a representative sample of administratively defined cities, towns and villages, human judgment of Google images is used to determine whether each unit is urban or rural in practice. Judgments are collected from four groups of assessors, who differ in their familiarity with India and with urban issues, following two different protocols. The judgment-based classification is then combined with data from the population census and from satellite imagery to predict the urban status of the sample units. Three approaches are applied: a logit model, LASSO, and random forests. The fitted models are then used to decide whether each out-of-sample administrative unit in India is urban or rural in practice. The analysis does not find that India is substantially more urban than officially claimed. However, there are important differences at more disaggregated levels, with "other towns" and "census towns" being more rural, and some southern states more urban, than is officially claimed. The consistency of human judgment across assessors and protocols, the easy availability of crowd-sourcing, and the stability of predictions across approaches suggest that the proposed methodology is a promising avenue for studying urban issues.
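The train-then-predict step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the two features (`density`, `lights`) are hypothetical stand-ins for census and satellite variables, and the fitting routine is a plain logistic regression with an optional LASSO-style L1 penalty, written with the Python standard library only. Random forests are not shown.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logit(X, y, l1=0.0, lr=0.5, epochs=500):
    """Fit a logistic regression by proximal gradient descent.

    With l1 > 0, slopes are soft-thresholded after each step, which is
    one standard way to apply a LASSO-style penalty (intercept unpenalized).
    """
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        for j in range(k):
            step = w[j] - lr * grad_w[j] / n
            # Soft-threshold: shrink toward zero by lr * l1.
            w[j] = math.copysign(max(abs(step) - lr * l1, 0.0), step)
    return w, b


def predict_urban(w, b, x, threshold=0.5):
    # Classify a unit as urban when its predicted probability passes the threshold.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= threshold


# Synthetic "judged sample": labels play the role of assessors' urban/rural
# calls; features are hypothetical standardized covariates.
random.seed(0)
X, y = [], []
for _ in range(200):
    density, lights = random.gauss(0, 1), random.gauss(0, 1)
    X.append([density, lights])
    y.append(1 if density + lights + random.gauss(0, 0.5) > 0 else 0)

w, b = fit_logit(X, y, l1=0.01)

# Apply the fitted model to two hypothetical out-of-sample units.
dense_bright_unit = [2.0, 2.0]
sparse_dark_unit = [-2.0, -2.0]
print(predict_urban(w, b, dense_bright_unit), predict_urban(w, b, sparse_dark_unit))
```

Setting `l1=0.0` gives an unpenalized logit fit; comparing predictions across penalty values is a rough analogue of checking stability across approaches.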

Suggested Citation

Galdo, Virgilio and Li, Yue and Rama, Martin, Identifying Urban Areas by Combining Human Judgment and Machine Learning: An Application to India (February 24, 2020). World Bank Policy Research Working Paper No. 9160, Available at SSRN: https://ssrn.com/abstract=3543805

Virgilio Galdo (Contact Author)

Michigan State University - Economics

Agriculture Hall
East Lansing, MI 48824-1122
United States
517-203-8372 (Phone)

HOME PAGE: http://search.msu.edu/people/index.php?uid=488171&prev=Galdo,%20Virgilio

Yue Li

World Bank

1818 H Street, NW
Washington, DC 20433
United States

Martin Rama

World Bank

1818 H Street, NW
Washington, DC 20433
United States
