Using Partially Synthetic Microdata to Protect Sensitive Cells in Business Statistics

29 Pages Posted: 10 Feb 2016

Javier Miranda

US Census Bureau - Economy-Wide Statistics Division

Lars Vilhuber

Cornell University - Department of Economics; U.S. Census Bureau - Center for Economic Studies

Date Written: February 01, 2016

Abstract

We describe and analyze a method that blends records from observed and synthetic microdata into public-use tabulations of establishment statistics. The resulting tables use synthetic data only in potentially sensitive cells. We describe several algorithms and present preliminary results from applying them to the Census Bureau's Business Dynamics Statistics and Synthetic Longitudinal Business Database, highlighting the accuracy and protection the method affords compared with existing public-use tabulations (with suppressions).
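
The general idea of using synthetic data only in sensitive cells can be illustrated with a minimal sketch. It is not the authors' algorithm: the sensitivity rule (a cell with fewer than `min_count` establishments), the threshold of 3, the record layout `(cell, employment)`, and the function names `tabulate` and `blend_tables` are all assumptions made for illustration, and the data are invented toy values.

```python
from collections import defaultdict


def tabulate(records):
    """Count establishments and sum employment per cell.

    `records` is an iterable of (cell, employment) pairs (hypothetical layout).
    """
    cells = defaultdict(lambda: {"n": 0, "emp": 0})
    for cell, emp in records:
        cells[cell]["n"] += 1
        cells[cell]["emp"] += emp
    return dict(cells)


def blend_tables(observed, synthetic, min_count=3):
    """Publish observed totals, except in cells flagged as sensitive.

    Assumption: a cell is sensitive when it has fewer than `min_count`
    observed establishments. Sensitive cells take the synthetic total;
    if no synthetic records exist for the cell, it is suppressed.
    """
    obs = tabulate(observed)
    syn = tabulate(synthetic)
    out = {}
    for cell, stats in obs.items():
        sensitive = stats["n"] < min_count
        if sensitive and cell in syn:
            out[cell] = {"emp": syn[cell]["emp"], "source": "synthetic"}
        elif sensitive:
            out[cell] = {"emp": None, "source": "suppressed"}
        else:
            out[cell] = {"emp": stats["emp"], "source": "observed"}
    return out


# Toy data: cell "A" has three establishments, cell "B" has only one.
observed = [("A", 10), ("A", 20), ("A", 30), ("B", 500)]
synthetic = [("A", 12), ("A", 18), ("A", 33), ("B", 470)]

table = blend_tables(observed, synthetic)
for cell in sorted(table):
    print(cell, table[cell])
```

In this toy example, cell "A" is published from observed data and cell "B" from synthetic data, so no cell needs to be suppressed outright.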

Keywords: synthetic data, statistical disclosure limitation, time-series, local labor markets, gross job flows, confidentiality protection

Suggested Citation

Miranda, Javier and Vilhuber, Lars, Using Partially Synthetic Microdata to Protect Sensitive Cells in Business Statistics (February 01, 2016). US Census Bureau Center for Economic Studies Paper No. CES-WP-16-10, Available at SSRN: https://ssrn.com/abstract=2729428 or http://dx.doi.org/10.2139/ssrn.2729428

Javier Miranda (Contact Author)

US Census Bureau - Economy-Wide Statistics Division

Washington, DC
United States

Lars Vilhuber

Cornell University - Department of Economics

Ithaca, NY
United States

U.S. Census Bureau - Center for Economic Studies

4700 Silver Hill Road
Washington, DC 20233
United States
