Iatrogenic Specification Error: A Cautionary Tale of Cleaning Data
24 Pages. Posted: 10 Apr 2004
Date Written: March 2004
Abstract
In empirical research it is common practice to clean data using sensible rules of thumb. Measurement error is often the justification for removing (trimming) or recoding (winsorizing) observations whose values lie outside a specified range. We consider a general measurement error process that nests many plausible models. Analytic results demonstrate that winsorizing and trimming solve the problem only for a narrow class of measurement error processes. Indeed, for the measurement error processes found in most social-science data, such procedures can induce or exacerbate bias, and can even inflate the variance estimates. We term this source of bias "iatrogenic" (or econometrician-induced) error. Monte Carlo simulations and empirical results from the Census PUMS data and the 2001 CPS data demonstrate the fragility of trimming and winsorizing as solutions to measurement error in the dependent variable. Even on asymptotic variance and RMSE criteria, we are unable to find generalizable justifications for commonly used cleaning procedures.
Keywords: measurement error models, trimming, winsorizing
JEL Classification: C1, J1
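To make the abstract's two cleaning rules concrete, the following Python sketch trims and winsorizes a simulated dependent variable and compares the resulting OLS slopes. It is only an illustration under assumptions that are not taken from the paper: the data-generating process is the simplest classical case (mean-zero additive error in y, independent of x), the cutoffs are the 5th and 95th percentiles, and all function names (trim, winsorize, ols_slope, simulate) are hypothetical. It does not reproduce the paper's general measurement error process or its Monte Carlo design.

```python
import numpy as np


def trim(y, x, lower_q=0.05, upper_q=0.95):
    """Drop observations whose y lies outside the chosen quantile range."""
    lo, hi = np.quantile(y, [lower_q, upper_q])
    keep = (y >= lo) & (y <= hi)
    return y[keep], x[keep]


def winsorize(y, lower_q=0.05, upper_q=0.95):
    """Recode values of y outside the quantile range to the boundary values."""
    lo, hi = np.quantile(y, [lower_q, upper_q])
    return np.clip(y, lo, hi)


def ols_slope(y, x):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def simulate(n=2000, reps=200, beta=1.0, seed=0):
    """Average OLS slope over replications for raw, trimmed, winsorized y.

    Assumed data-generating process (illustrative only):
    y = beta * x + e, with x and e independent standard normals.
    """
    rng = np.random.default_rng(seed)
    slopes = {"raw": [], "trimmed": [], "winsorized": []}
    for _ in range(reps):
        x = rng.normal(size=n)
        y = beta * x + rng.normal(size=n)
        slopes["raw"].append(ols_slope(y, x))
        y_t, x_t = trim(y, x)
        slopes["trimmed"].append(ols_slope(y_t, x_t))
        slopes["winsorized"].append(ols_slope(winsorize(y), x))
    return {name: float(np.mean(vals)) for name, vals in slopes.items()}


if __name__ == "__main__":
    for name, mean_slope in simulate().items():
        print(f"{name:>10}: mean estimated slope = {mean_slope:.3f}")
```

In this assumed setup the uncleaned regression is already unbiased, so any gap between the raw slope and the trimmed or winsorized slopes is introduced by the cleaning step itself. Changing the error process or the cutoffs inside simulate gives a quick check of how sensitive the estimates are to those choices.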