Local Differential Privacy with K-anonymous for Frequency Estimation

Dan Zhao, Hong Chen, Suyun Zhao, Cuiping Li, Xiaoying Zhang, Ruixuan Liu
DOI: https://doi.org/10.1109/bigdata47090.2019.9006022
2019-01-01
Abstract: Many data analysis and machine learning tasks require releasing data, such as statistics of a data distribution, which poses significant risks to users' privacy. To preserve the privacy of every individual, frequency estimation based on local differential privacy (LDP) is commonly used in place of the real data distribution. Unfortunately, when an individual sends values multiple times, privacy leakage (the same-value problem) may occur, along with performance problems such as high memory usage. To narrow these gaps, this paper proposes SAnonLDP (Sample Anonymous Local Differential Privacy). The SAnonLDP framework integrates k-anonymity into LDP and consists of four blocks: random grouping; anonymous and Walsh-Fourier transforms; random response; and singular value decomposition (SVD). Among them, the second block, 'Anonymous and Walsh-Fourier transforms', significantly decreases the communication cost and the memory requirements. The remaining blocks compensate for the resulting loss of information to achieve an acceptable frequency estimate. More importantly, we prove through rigorous mathematical reasoning that this estimate is unbiased. Finally, numerical experiments demonstrate that SAnonLDP achieves better KL-divergence and estimation error than another known privacy model, RAPPOR.
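As background for the "random response" block and the unbiasedness claim, the following is a minimal sketch of LDP frequency estimation using generic k-ary randomized response. It is not the SAnonLDP algorithm: it omits random grouping, the anonymous and Walsh-Fourier transforms, and the SVD step. All function names and the parameters `k` and `epsilon` are assumptions introduced here for illustration.

```python
import math
import random
from collections import Counter


def grr_probs(k, epsilon):
    """Return (p, q): probability of keeping the true value, and of
    reporting any one specific other value, under k-ary randomized response."""
    e = math.exp(epsilon)
    return e / (e + k - 1), 1.0 / (e + k - 1)


def perturb(value, k, epsilon, rng):
    """Client side: report the true value with probability p, otherwise a
    uniformly chosen different value from {0, ..., k-1}."""
    p, _ = grr_probs(k, epsilon)
    if rng.random() < p:
        return value
    other = rng.randrange(k - 1)  # index among the k-1 remaining values
    return other if other < value else other + 1


def estimate_frequencies(reports, k, epsilon):
    """Server side: invert E[observed_v] = q + (p - q) * f_v to obtain an
    unbiased estimate of each true frequency f_v."""
    p, q = grr_probs(k, epsilon)
    n = len(reports)
    counts = Counter(reports)
    return [(counts.get(v, 0) / n - q) / (p - q) for v in range(k)]


def kl_divergence(true_freqs, est_freqs, floor=1e-12):
    """KL(true || est), after clipping estimates to be positive and
    renormalizing (debiased estimates can be slightly negative)."""
    clipped = [max(x, floor) for x in est_freqs]
    total = sum(clipped)
    return sum(t * math.log(t / (c / total))
               for t, c in zip(true_freqs, clipped) if t > 0)


if __name__ == "__main__":
    rng = random.Random(0)
    true_freqs = [0.5, 0.3, 0.15, 0.05]
    values = rng.choices(range(4), weights=true_freqs, k=20000)
    reports = [perturb(v, 4, 1.0, rng) for v in values]
    est = estimate_frequencies(reports, 4, 1.0)
    print("estimated frequencies:", est)
    print("KL divergence:", kl_divergence(true_freqs, est))
```

The debiasing step is what makes the estimate unbiased in this simple setting: each reported frequency has expectation q + (p - q) f, so subtracting q and dividing by (p - q) recovers f in expectation.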