DPPro: Differentially Private High-Dimensional Data Release Via Random Projection

Chugui Xu, Ju Ren, Yaoxue Zhang, Zhan Qin, Kui Ren
DOI: https://doi.org/10.1109/tifs.2017.2737966
IF: 7.231
2017-01-01
IEEE Transactions on Information Forensics and Security
Abstract: Releasing representative data sets without compromising data privacy has attracted increasing attention from the database community in recent years. Differential privacy is an influential privacy framework for data mining and data release without revealing sensitive information. However, existing differentially private solutions cannot effectively handle the release of high-dimensional data, because perturbation error and computational complexity both grow with the dimension. To address this deficiency, we propose DPPro, a differentially private algorithm for high-dimensional data release via random projection that maximizes utility while guaranteeing privacy. We theoretically prove that DPPro generates a synthetic data set that approximately preserves the squared Euclidean distance between high-dimensional vectors while achieving (ε, δ)-differential privacy. Our theoretical analysis shows that the utility guarantees of the released data depend on the projection dimension and the variance of the noise. Extensive experimental results demonstrate that DPPro substantially outperforms several state-of-the-art solutions in terms of perturbation error and privacy budget on high-dimensional data sets.
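The general idea of projecting and then perturbing can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the Gaussian projection matrix with N(0, 1/k) entries, and the noise scale `sigma` are all assumptions made for illustration, and `sigma` is a free parameter here rather than a value calibrated to a given (ε, δ). The sketch only shows why the squared Euclidean distance survives the release: the projection preserves it in expectation, and independent noise of variance sigma² on each of the k coordinates of both vectors adds an expected bias of 2kσ², which can be subtracted.

```python
import math
import random


def project_and_perturb(rows, k, sigma, rng):
    """Project d-dimensional rows to k dimensions, then add Gaussian noise.

    Assumption: sigma is chosen by the caller; this sketch does NOT
    calibrate it to any (epsilon, delta) privacy guarantee.
    """
    d = len(rows[0])
    # Projection matrix with i.i.d. N(0, 1/k) entries, so that squared
    # Euclidean distances are preserved in expectation.
    scale = 1.0 / math.sqrt(k)
    proj = [[rng.gauss(0.0, scale) for _ in range(k)] for _ in range(d)]
    released = []
    for row in rows:
        projected = [sum(row[i] * proj[i][j] for i in range(d)) for j in range(k)]
        # Independent Gaussian noise on every projected coordinate.
        released.append([p + rng.gauss(0.0, sigma) for p in projected])
    return released


def estimate_sq_distance(u, v, sigma):
    """Estimate the original squared distance from two released vectors.

    Each vector carries noise of variance sigma^2 per coordinate, so the
    raw squared distance is inflated by 2 * k * sigma^2 in expectation.
    """
    k = len(u)
    raw = sum((a - b) ** 2 for a, b in zip(u, v))
    return raw - 2 * k * sigma ** 2


rng = random.Random(0)
d, k, sigma = 500, 400, 0.1  # hypothetical sizes chosen for the demo
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
y = [rng.gauss(0.0, 1.0) for _ in range(d)]
true_sq = sum((a - b) ** 2 for a, b in zip(x, y))

u, v = project_and_perturb([x, y], k, sigma, rng)
est = estimate_sq_distance(u, v, sigma)
print(f"true squared distance: {true_sq:.1f}, estimate: {est:.1f}")
```

In this sketch a larger `k` tightens the distance estimate, while a larger `sigma` loosens it, which mirrors the abstract's remark that utility depends on the projection dimension and the noise variance.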