Correlation Analysis for Key-Value Data with Local Differential Privacy

SUN Lin, PING Guo-lou, YE Xiao-jun
DOI: https://doi.org/10.11896/jsjkx.201200122
2021-01-01
Abstract: Crowdsensing systems routinely collect and analyze crowdsourced data from distributed sources to build effective data-mining models. Such data usually contain personal information, so collecting and analyzing them can leak private information. Local differential privacy (LDP) has become the de facto standard for trading off privacy guarantees against data utility. Key-value data is a heterogeneous data type in which the key is categorical and the value is numerical. Achieving LDP for key-value data is challenging. This paper focuses on key-value data publishing and correlation analysis under the LDP framework. First, frequency correlation and mean correlation in key-value data are defined. Then, an indexing one-hot perturbation mechanism is proposed to provide LDP guarantees. Finally, the correlation results are estimated in the perturbed space. Theoretical analysis and experimental results on both real-world and synthetic datasets validate the effectiveness of the proposed mechanism.
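The abstract does not spell out the indexing one-hot perturbation mechanism, so the following is not the authors' method. It is a minimal sketch of a generic, well-known LDP building block, symmetric unary encoding (randomized response applied independently to each bit of a one-hot vector), shown only to illustrate how frequencies of a categorical key can be estimated from perturbed reports. The function names, the privacy budget `epsilon`, and the domain size `d` are assumptions introduced for illustration; numerical values and correlation estimation are not covered.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting a bit unchanged (symmetric unary encoding).

    Two one-hot vectors differ in exactly two positions, so keeping each
    bit with probability e^(eps/2) / (e^(eps/2) + 1) yields eps-LDP.
    """
    return math.exp(epsilon / 2) / (math.exp(epsilon / 2) + 1)


def perturb_one_hot(index, d, epsilon, rng):
    """One-hot encode `index` over d keys, then flip each bit independently."""
    p = keep_probability(epsilon)
    report = []
    for j in range(d):
        true_bit = 1 if j == index else 0
        # Keep the true bit with probability p, otherwise flip it.
        report.append(true_bit if rng.random() < p else 1 - true_bit)
    return report


def estimate_frequencies(reports, epsilon):
    """Unbiased frequency estimate for each key from the perturbed reports."""
    n = len(reports)
    d = len(reports[0])
    p = keep_probability(epsilon)
    q = 1 - p  # probability that a 0 bit is reported as 1
    estimates = []
    for j in range(d):
        observed = sum(r[j] for r in reports) / n
        # E[observed] = f * p + (1 - f) * q, solved for the true frequency f.
        estimates.append((observed - q) / (p - q))
    return estimates


def demo(n=20000, epsilon=2.0, seed=0):
    """Simulate n users holding keys drawn from a hypothetical distribution."""
    rng = random.Random(seed)
    true_freq = [0.5, 0.3, 0.2]  # illustrative values, not from the paper
    keys = rng.choices(range(len(true_freq)), weights=true_freq, k=n)
    reports = [perturb_one_hot(k, len(true_freq), epsilon, rng) for k in keys]
    return true_freq, estimate_frequencies(reports, epsilon)
```

Calling `demo()` returns the assumed true frequencies alongside the estimates recovered from the perturbed reports. The estimates approach the true values as the number of users grows, and they become noisier as `epsilon` shrinks.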