Towards Correlated Data Trading for High-Dimensional Private Data

Hui Cai, Yuanyuan Yang, Weibei Fan, Fu Xiao, Yanmin Zhu
DOI: https://doi.org/10.1109/tpds.2023.3237691
IF: 5.3
2023-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract: The commoditization of private data has become an attractive research topic with the emergence of the Big Data era. In this paper, we study the trading of high-dimensional private data with a differential privacy guarantee. We propose Cheap, a novel Correlated data trading framework for High-dimEnsionAl Private data. Cheap first models data correlations among high-dimensional user attributes and builds an initial attribute clustering scheme. Building on this scheme, Cheap devises a novel data perturbation mechanism by solving the optimal attribute clustering (OAC) problem, in order to improve the utility of the traded data and to generate a privacy-preserving high-dimensional dataset whose joint distribution is close to that of the original one. Because the OAC problem is NP-hard, Cheap then quantifies privacy loss based on a near-optimal attribute clustering scheme, and compensates data owners by running an auction in a cost-effective way. We evaluate the performance of Cheap on the UserBehavior dataset and the Obesity dataset. Our evaluation and analysis demonstrate that Cheap balances data utility and privacy protection well, and achieves all the desired economic properties of budget balance, individual rationality, and truthfulness.
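To make the idea of perturbing one attribute cluster under differential privacy concrete, the following is a minimal sketch of the standard Laplace mechanism applied to the joint histogram of a cluster. It is not Cheap's perturbation mechanism, which the abstract does not specify. The function name `noisy_joint_histogram`, the `domains` argument, and the toy records are all hypothetical, and the sketch assumes that neighboring datasets differ by one record, so each count has sensitivity 1.

```python
import itertools
import random
from collections import Counter


def noisy_joint_histogram(records, cluster, domains, epsilon, rng=None):
    """Return Laplace-perturbed joint counts over the attributes in `cluster`.

    Hypothetical helper: a generic Laplace mechanism, not the paper's method.
    `domains` maps each attribute to its list of possible values, so that
    noise is added to every cell, including cells with a true count of zero.
    """
    rng = rng or random.Random()
    counts = Counter(tuple(r[a] for a in cluster) for r in records)
    # Sensitivity 1 gives Laplace scale 1/epsilon, i.e. exponential rate epsilon.
    rate = epsilon
    noisy = {}
    for cell in itertools.product(*(domains[a] for a in cluster)):
        # The difference of two i.i.d. exponentials is Laplace-distributed.
        noise = rng.expovariate(rate) - rng.expovariate(rate)
        # Clamping at zero is post-processing and does not weaken the guarantee.
        noisy[cell] = max(0.0, counts[cell] + noise)
    return noisy


# Toy usage with made-up records and two binary attributes in one cluster.
records = [{"age": "young", "active": "yes"}, {"age": "old", "active": "no"}]
domains = {"age": ["young", "old"], "active": ["yes", "no"]}
hist = noisy_joint_histogram(records, ["age", "active"], domains, epsilon=1.0)
```

Grouping correlated attributes into one cluster lets their joint distribution be perturbed together, but the number of cells grows with the product of the attribute domains, which is one reason the choice of clustering matters for utility.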