Top-k Frequent Itemsets Publication of Uncertain Data Based on Differential Privacy

Yunfeng Zou, Xiaoyi Bao, Chao Xu, Weiwei Ni
DOI: https://doi.org/10.1007/978-3-030-60029-7_49
2020-01-01
Abstract: Privacy-preserving frequent itemset mining (PPFIM) on uncertain data is attracting growing interest as attention to data privacy increases. Existing methods use a filter function that satisfies differential privacy through the exponential mechanism to obtain the top-k frequent itemsets, and then add Laplace noise to their supports to release the top-k frequent itemsets under privacy protection. Because the privacy protection mechanism is independent of the mining process, the accuracy of the released top-k frequent itemsets depends on the value of k. When the algorithm is applied to large-scale frequent itemsets, it is difficult to balance data availability against privacy. To address these deficiencies, the privacy protection mechanism and the mining process are integrated into a single PPFIM strategy that separates noise addition from top-k filtering and removes the dependence of the algorithm's accuracy on the value of k. A candidate-level information extraction strategy is designed to shrink the search space and reduce the privacy budget required, exploiting the upper-limit threshold property of uncertain data sets. On this basis, a novel algorithm, Uncertain difference privacy level frequent itemset mining (UDP-LFIM), is proposed. Theoretical analysis and experiments demonstrate that the top-k itemsets published by the algorithm remain accurate while satisfying differential privacy.
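To make the baseline criticized above concrete, the following is a minimal sketch of the existing two-step approach (exponential-mechanism selection followed by Laplace noise on the supports). It is not the UDP-LFIM algorithm, whose details the abstract does not give. The function names, the even budget split, the sensitivity of 1, and the representation of an uncertain transaction as a dict of item probabilities are all assumptions made for illustration.

```python
import math
import random


def expected_support(itemset, transactions):
    """Expected support: sum over transactions of the product of item probabilities.

    Assumption: each uncertain transaction is a dict {item: existence probability}.
    """
    total = 0.0
    for t in transactions:
        p = 1.0
        for item in itemset:
            p *= t.get(item, 0.0)
        total += p
    return total


def exponential_pick(scores, eps, sensitivity, rng):
    """Pick one key with probability proportional to exp(eps * score / (2 * sensitivity))."""
    keys = list(scores)
    top = max(scores.values())  # subtract the max score for numerical stability
    weights = [math.exp(eps * (scores[c] - top) / (2.0 * sensitivity)) for c in keys]
    return rng.choices(keys, weights=weights, k=1)[0]


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential variates."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def baseline_top_k(candidates, transactions, k, eps, rng):
    """Hypothetical baseline: select k itemsets, then perturb their supports.

    Assumptions: sensitivity 1 (one transaction changes an expected support
    by at most 1) and half the budget for selection, half for noise, each
    half divided evenly over the k releases.
    """
    scores = {c: expected_support(c, transactions) for c in candidates}
    eps_select = eps / 2.0 / k
    eps_noise = eps / 2.0 / k
    released = {}
    for _ in range(min(k, len(scores))):
        chosen = exponential_pick(scores, eps_select, 1.0, rng)
        # Remove the chosen itemset so it cannot be selected twice.
        released[chosen] = scores.pop(chosen) + laplace(1.0 / eps_noise, rng)
    return released


transactions = [{"a": 0.9, "b": 0.5}, {"a": 0.8}, {"b": 0.4, "c": 1.0}]
candidates = [("a",), ("b",), ("c",), ("a", "b")]
result = baseline_top_k(candidates, transactions, k=2, eps=1.0, rng=random.Random(0))
```

Under this budget split, each selection and each noisy support receives only eps / (2k), so both steps become noisier as k grows. That is the dependence on k that the abstract identifies as the weakness of existing methods.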