ABSyn: An Accurate Differentially Private Data Synthesis Scheme With Adaptive Selection and Batch Processes
Jingyu Jia, Xinhao Li, Tong Li, Zhewei Liu, Chang Tan, Siyi Lv, Liang Guo, Changyu Dong, Zheli Liu
DOI: https://doi.org/10.1109/tifs.2024.3453175
IF: 7.231
2024-09-20
IEEE Transactions on Information Forensics and Security
Abstract: In private data publishing, a promising solution is to generate synthetic data that supports any query on the private dataset while satisfying differential privacy. Over the past decade, researchers have mainly focused on improving the query accuracy of synthetic data. However, the limitations of existing works prevent them from achieving a better trade-off between accuracy and privacy. In this paper, we propose ABSyn, a novel scheme for differentially private data synthesis. Under the Select-Measure-Generate paradigm, ABSyn uses an adaptive mechanism to select marginals precisely and follows a batch process. Our adaptive-batch scheme provides a well-selected marginal set and an optimal allocation of the privacy budget, which allows its synthetic data to achieve high accuracy without compromising privacy. We implement an efficient prototype of ABSyn and compare it with existing works on public datasets. Experimental results show that ABSyn improves query accuracy on synthetic datasets by a factor of and efficiency by a factor of over the state-of-the-art scheme on average.
computer science, theory & methods; engineering, electrical & electronic
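
As a rough illustration of the "Measure" step in the Select-Measure-Generate paradigm mentioned in the abstract, the sketch below adds Laplace noise to a fixed set of marginals. It is not ABSyn's algorithm: the marginal set is given by hand rather than selected adaptively, the privacy budget is split uniformly rather than optimally, and the "Generate" step is omitted. The function names `noisy_marginal` and `measure_all`, the record layout, and the toy data are all assumptions made for this example.

```python
import random
from collections import Counter
from itertools import product


def noisy_marginal(records, attrs, domains, epsilon, rng):
    """Measure one marginal (a contingency table over `attrs`) with Laplace noise.

    Assumes add/remove-one-record neighbouring datasets, so the L1
    sensitivity of the count vector is 1 and the noise scale is 1/epsilon.
    """
    counts = Counter(tuple(r[a] for a in attrs) for r in records)
    scale = 1.0 / epsilon
    noisy = {}
    for cell in product(*(domains[a] for a in attrs)):
        # The difference of two i.i.d. exponentials with mean `scale`
        # is Laplace-distributed with that scale.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy[cell] = counts.get(cell, 0) + noise
    return noisy


def measure_all(records, marginals, domains, total_epsilon, rng):
    """Measure every marginal, splitting the budget uniformly.

    Uniform splitting under sequential composition is a simplification
    chosen for this sketch; it is not the allocation the paper proposes.
    """
    per_marginal = total_epsilon / len(marginals)
    return {
        attrs: noisy_marginal(records, attrs, domains, per_marginal, rng)
        for attrs in marginals
    }


# Hypothetical toy dataset with two binary attributes and one ternary one.
domains = {"a": [0, 1], "b": [0, 1], "c": [0, 1, 2]}
records = [
    {"a": 0, "b": 1, "c": 2},
    {"a": 0, "b": 1, "c": 0},
    {"a": 1, "b": 0, "c": 1},
]
marginals = [("a", "b"), ("b", "c")]  # hand-picked, not adaptively selected

tables = measure_all(records, marginals, domains, 1.0, random.Random(0))
```

Each entry of `tables` maps a marginal to its noisy cell counts. A full pipeline would then fit a synthetic dataset to these noisy tables, and a scheme such as ABSyn would additionally decide which marginals to measure and how much budget each one receives.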