Data Masking by Noise Addition and the Estimation of Nonparametric Regression Models
Journal of Economics and Statistics, Vol. 225, No. 5, pp. 517-528, 2005
Posted: 17 Jul 2006
Abstract
Data-collecting institutions use a wide range of masking procedures to protect data against disclosure. Generally, a masking procedure can be regarded as a kind of data filter that transforms the true data generating process. Such a transformation severely affects the quality of the data and limits its use for empirical research. A popular masking procedure is noise addition, which leads to inconsistent estimates if the additional measurement errors are ignored.
This paper investigates to what extent appropriate econometric techniques can yield consistent estimates of the true data generating process for parametric and nonparametric models when data is masked by noise addition. We show how the loss of data quality can be minimized using the local polynomial Simulation-Extrapolation (SIMEX) estimator. Evidence is provided by a Monte Carlo study and by an application to firm-level data, where we analyze the impact of innovative activity on employment.
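The general SIMEX idea can be illustrated with a minimal sketch. It is not the paper's local polynomial estimator: it uses a plain linear regression slope, assumes the standard deviation of the added noise (`sigma_u`) is known, and extrapolates with a quadratic, which is only an approximation. The function name `simex_slope`, the grid `lambdas`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def simex_slope(w, y, sigma_u, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0), n_sim=50, seed=0):
    """Illustrative SIMEX for a linear slope (assumes sigma_u is known)."""
    rng = np.random.default_rng(seed)
    lambdas = np.asarray(lambdas, dtype=float)
    mean_slopes = []
    for lam in lambdas:
        # Simulation step: add extra noise with variance lam * sigma_u**2
        reps = 1 if lam == 0.0 else n_sim
        slopes = []
        for _ in range(reps):
            w_lam = w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.shape)
            slopes.append(np.polyfit(w_lam, y, 1)[0])  # OLS slope of y on w_lam
        mean_slopes.append(np.mean(slopes))
    # Extrapolation step: fit a quadratic in lambda and evaluate at lambda = -1,
    # which corresponds to the hypothetical case of no measurement error.
    coefs = np.polyfit(lambdas, mean_slopes, 2)
    naive = mean_slopes[0]
    return naive, np.polyval(coefs, -1.0)

# Simulated example (all values are assumptions for illustration only)
rng = np.random.default_rng(42)
n, true_slope, sigma_u = 5000, 2.0, 0.5
x = rng.standard_normal(n)                        # unobserved true regressor
y = true_slope * x + 0.5 * rng.standard_normal(n)
w = x + sigma_u * rng.standard_normal(n)          # regressor masked by noise addition
naive, simex = simex_slope(w, y, sigma_u)
print(f"naive slope: {naive:.3f}, SIMEX slope: {simex:.3f}")
```

In this simulated setting the naive slope is attenuated toward zero, and the extrapolated value moves it back toward the true slope. Because the quadratic extrapolant is only approximate, some bias typically remains.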
Keywords: data masking, errors-in-variables, SIMEX, local polynomial regression
JEL Classification: C21, J24, J31