T-Closeness Slicing: A New Privacy Preserving Approach for Transactional Data Publishing
INFORMS Journal on Computing, Forthcoming
42 Pages. Posted: 7 Feb 2018
Date Written: January 29, 2018
Abstract
Privacy-preserving data publishing has received much attention in recent years. Prior studies have developed various algorithms, such as generalization, anatomy, and L-diversity slicing, to protect individuals’ privacy when transactional data are published for public use. These existing algorithms, however, all have limitations. For instance, generalization protects identity privacy well but loses a considerable amount of information. Anatomy prevents attribute disclosure and lowers information loss but fails to protect membership privacy. The more recent probability L-diversity slicing algorithm overcomes some shortcomings of generalization and anatomy but cannot shield data from subtler attacks such as the skewness attack and the similarity attack. To meet the needs of data owners with stringent privacy requirements, this study develops a novel method, t-closeness slicing (TCS), to better protect transactional data against various attacks. The time complexity of TCS is O(n log n), where n is the number of records in the dataset, so the algorithm scales well to large datasets. We conduct experiments on three transactional datasets and find that TCS not only effectively protects membership privacy, identity privacy, and attribute privacy, but also preserves data utility better than the benchmark algorithms.
Keywords: transactional data; privacy preservation; data publishing; t-closeness model; slicing technique
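To make the t-closeness condition named in the abstract concrete, the following is a minimal sketch of the generic requirement only, not the TCS algorithm itself, which the abstract does not describe. It assumes the sensitive attribute is categorical and uses variational distance (half the L1 distance) between a group's distribution and the overall distribution. The function names and the toy data are hypothetical, chosen for illustration.

```python
from collections import Counter


def distribution(values):
    """Return the relative frequency of each value in `values`."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


def satisfies_t_closeness(groups, t):
    """True if every group's sensitive-value distribution is within
    distance t of the distribution over all groups combined."""
    overall = distribution([v for g in groups for v in g])
    return all(
        variational_distance(distribution(g), overall) <= t for g in groups
    )


# Hypothetical toy data: each inner list holds the sensitive values of
# the records placed in one published group.
skewed = [["flu", "flu", "flu", "flu"], ["hiv", "hiv", "hiv", "flu"]]
balanced = [["flu", "flu", "flu", "hiv"], ["flu", "flu", "hiv", "flu"]]

print(satisfies_t_closeness(skewed, t=0.2))
print(satisfies_t_closeness(balanced, t=0.2))
```

In the `skewed` example the second group contains two distinct sensitive values, yet 75% of its records carry one value that appears in only 37.5% of the full table. This is the kind of gap a skewness attack exploits, and the check rejects it for `t = 0.2`. The `balanced` grouping mirrors the overall distribution exactly and passes.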