Improving Data Utilization of K-anonymity Through Clustering Optimization.

Hewen Wang, Jingsha He, Nafei Zhu
2022-01-01
Abstract: The k-anonymity privacy protection model performs well in privacy protection and has been widely applied in scenarios such as data publishing, location-based services, and social networks. To ensure that k-anonymity meets privacy protection requirements while improving data utilization, this study proposes a k-anonymity algorithm based on central point clustering. The algorithm improves clustering quality by optimizing the selection of cluster centroids, which in turn improves the effectiveness and efficiency of k-anonymity. After clustering, the quasi-identifier attributes are aligned for classification and generalization, and the result is evaluated using appropriate information loss metrics. To measure the distance between records and between records and clusters, this study also defines a distance that is positively correlated with the amount of information lost, combining the depth and width of the generalization hierarchy, in an effort to improve the utility of the algorithm. The experimental results show that the proposed algorithm not only meets the basic anonymity requirements but also improves data utilization compared with some prevailing algorithms.
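The general idea of clustering-based k-anonymity described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the centroid selection rule or the hierarchy-based distance, so the sketch assumes purely numeric quasi-identifiers, a range-normalized Manhattan distance, a "farthest from the centroid" seeding rule, and a normalized range-width loss metric. All function names (`k_anonymize`, `generalize`, `information_loss`, and the helpers) are hypothetical.

```python
from typing import List, Sequence, Tuple

Record = Tuple[float, ...]


def attribute_ranges(records: Sequence[Record]) -> List[float]:
    """Width of each quasi-identifier over the given records."""
    dims = len(records[0])
    return [max(r[i] for r in records) - min(r[i] for r in records)
            for i in range(dims)]


def centroid(cluster: Sequence[Record]) -> Record:
    """Attribute-wise mean of a non-empty cluster."""
    n = len(cluster)
    return tuple(sum(r[i] for r in cluster) / n for i in range(len(cluster[0])))


def distance(a: Record, b: Record, ranges: Sequence[float]) -> float:
    """ASSUMPTION: range-normalized Manhattan distance, standing in for the
    paper's hierarchy-based distance, which the abstract does not specify."""
    return sum(abs(x - y) / w for x, y, w in zip(a, b, ranges) if w > 0)


def k_anonymize(records: Sequence[Record], k: int) -> List[List[Record]]:
    """Greedily group records into clusters of at least k members."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    ranges = attribute_ranges(records)
    remaining = list(records)
    clusters: List[List[Record]] = []
    while len(remaining) >= k:
        # ASSUMPTION: seed each cluster with the record farthest from the
        # centroid of the records not yet assigned.
        center = centroid(remaining)
        seed = max(remaining, key=lambda r: distance(r, center, ranges))
        remaining.remove(seed)
        cluster = [seed]
        while len(cluster) < k:
            c = centroid(cluster)
            nearest = min(remaining, key=lambda r: distance(r, c, ranges))
            remaining.remove(nearest)
            cluster.append(nearest)
        clusters.append(cluster)
    # Fewer than k records are left: attach each to its nearest cluster.
    for r in remaining:
        best = min(clusters, key=lambda cl: distance(r, centroid(cl), ranges))
        best.append(r)
    return clusters


def generalize(cluster: Sequence[Record]) -> Tuple[Tuple[float, float], ...]:
    """Replace each attribute by the (min, max) interval over the cluster."""
    dims = len(cluster[0])
    return tuple((min(r[i] for r in cluster), max(r[i] for r in cluster))
                 for i in range(dims))


def information_loss(clusters: Sequence[Sequence[Record]]) -> float:
    """ASSUMPTION: average normalized interval width per record and
    attribute, a value in [0, 1] where 0 means no generalization."""
    records = [r for c in clusters for r in c]
    ranges = attribute_ranges(records)
    total = 0.0
    for c in clusters:
        box = generalize(c)
        width = sum((hi - lo) / w for (lo, hi), w in zip(box, ranges) if w > 0)
        total += len(c) * width
    return total / (len(records) * len(ranges))
```

Calling `k_anonymize(records, k)` on a list of numeric tuples returns clusters of at least `k` records each; `generalize` then yields the published interval for each cluster, and `information_loss` gives a rough measure of how much precision the generalization costs. Categorical attributes with generalization hierarchies, which the paper addresses, would need a different distance and loss definition.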