Identification and Formal Privacy Guarantees

69 Pages. Posted: 17 Jul 2020. Last revised: 4 May 2021.

Tatiana Komarova

London School of Economics & Political Science (LSE)

Denis Nekipelov

University of Virginia - Department of Economics

Date Written: April 25, 2021

Abstract

Empirical economic research crucially relies on highly sensitive individual datasets.
At the same time, the increasing availability of public individual-level data from social
networks, public government records and directories makes it possible for adversaries to
potentially re-identify anonymized records in sensitive research datasets. The most commonly
accepted formal definition of an individual non-disclosure guarantee is differential privacy.
With differential privacy in place, the researcher interacts with the data by issuing queries
that evaluate functions of the data. The differential privacy guarantee is achieved by replacing
the actual outcome of the query with a randomized outcome, with the amount of randomness
determined by the sensitivity of the outcome to individual observations in the data.
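
As an illustration of the query mechanism described above, the following is a minimal sketch of the standard Laplace mechanism applied to a mean query. It is not taken from the paper: the function names `laplace_noise` and `private_mean`, the clipping bounds `lower` and `upper`, and the privacy parameter `epsilon` are assumptions made for this example. The sketch assumes a fixed sample size and neighboring datasets that differ in one record, so the sensitivity of the clipped mean is (upper - lower) / n.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng=None):
    """Return a noisy mean of `values` (hypothetical helper, illustration only).

    Assumes each record is clipped to [lower, upper] and that neighboring
    datasets differ in exactly one record, with the sample size held fixed.
    """
    rng = rng or random.Random()
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    # Changing one record moves the clipped mean by at most this amount.
    sensitivity = (upper - lower) / n
    true_mean = sum(clipped) / n
    # Noise scale grows with sensitivity and shrinks as epsilon increases.
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    data = [0.2, 0.4, 0.9, 0.5]
    print(private_mean(data, lower=0.0, upper=1.0, epsilon=1.0))
```

A smaller `epsilon` gives a stronger guarantee and a noisier answer. Because the sensitivity here falls at rate 1/n, the noise vanishes for a smooth query like the mean as the sample grows.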

While differential privacy does provide formal non-disclosure guarantees, its impact on the
identification of empirical economic models, as well as its impact on the performance of
estimators in nonlinear empirical econometric models, has not been sufficiently studied. Since
privacy protection mechanisms are inherently finite-sample procedures, we define the notion of
identifiability of the parameter of interest under differential privacy as a property of the
limit of experiments. It is naturally characterized by concepts from random set theory and is
linked to the asymptotic behavior in measure of differentially private estimators.

We demonstrate that particular instances of regression discontinuity design may be problematic
for inference with differential privacy. Their parameters turn out to be neither point nor
partially identified. The set of differentially private estimators converges weakly to a random
set. Our simulation evidence supports this result. Our analysis suggests that many other
estimators that rely on nuisance parameters may have similar properties under the requirement
of differential privacy. Identification becomes possible if the target parameter can be
deterministically localized within the random set. In that case, a full exploration of the
random set of the weak limits of differentially private estimators can allow the data curator
to select a sequence of instances of differentially private estimators that is guaranteed to
converge to the target parameter in probability. We provide a decision-theoretic approach to
this selection.

Keywords: Differential privacy, average treatment effect, regression discontinuity, random sets, identification

JEL Classification: C35, C14, C25, C13

Suggested Citation

Komarova, Tatiana and Nekipelov, Denis, Identification and Formal Privacy Guarantees (April 25, 2021). Available at SSRN: https://ssrn.com/abstract=3635824 or http://dx.doi.org/10.2139/ssrn.3635824

Tatiana Komarova (Contact Author)

London School of Economics & Political Science (LSE)

Houghton Street
London, WC2A 2AE
United Kingdom
+44 02078523707 (Phone)

HOME PAGE: http://personal.lse.ac.uk/komarova/

Denis Nekipelov

University of Virginia - Department of Economics

237 Monroe Hall
P.O. Box 400182
Charlottesville, VA 22904-418
United States
