Clustering-Based k-anonymity

Xianmang He, HuaHui Chen, Yefang Chen, Yihong Dong, Peng Wang, Zhenhua Huang
DOI: https://doi.org/10.1007/978-3-642-30217-6_34
2012-01-01
Abstract: Privacy is one of the major concerns when data containing sensitive information needs to be released for ad hoc analysis, and it has attracted wide research interest in privacy-preserving data publishing in the past few years. One strategy for anonymizing data is generalization. In a typical generalization approach, the tuples in a table are first divided into many QI-groups (quasi-identifier groups) such that the size of each QI-group is no less than k. Clustering partitions the tuples into many clusters such that the points within a cluster are more similar to each other than to points in different clusters. The two methods share a common feature: both distribute the tuples into many small groups. Motivated by this observation, we propose a clustering-based k-anonymity algorithm, which achieves k-anonymity through clustering. Extensive experiments on real data sets are also conducted, showing that our approach improves utility.
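To make the link between clustering and QI-groups concrete, the following is a minimal Python sketch of the general idea: group nearby tuples until every group holds at least k of them, then generalize each group to value ranges. It is not the algorithm proposed in the paper, whose details the abstract does not give. The function names `greedy_k_groups` and `generalize`, the restriction to numeric quasi-identifiers, and the use of Manhattan distance are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: each record holds only numeric quasi-identifier values.
Record = Tuple[int, ...]


def _distance(a: Record, b: Record) -> int:
    """Manhattan distance between two records (an illustrative choice)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def greedy_k_groups(records: List[Record], k: int) -> List[List[Record]]:
    """Partition records into groups of at least k tuples each.

    Hypothetical greedy scheme: take a seed, attach its k-1 nearest
    neighbours, repeat; leftovers join the group with the nearest seed.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = sorted(records)
    groups: List[List[Record]] = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # Order the remaining records by distance to the seed.
        remaining.sort(key=lambda r: _distance(r, seed))
        groups.append([seed] + remaining[: k - 1])
        remaining = remaining[k - 1:]
    # Fewer than k records are left: merge each into the closest group.
    for r in remaining:
        nearest = min(groups, key=lambda g: _distance(r, g[0]))
        nearest.append(r)
    return groups


def generalize(group: List[Record]) -> Tuple[Tuple[int, int], ...]:
    """Replace each attribute by its (min, max) range over the group."""
    return tuple((min(col), max(col)) for col in zip(*group))


if __name__ == "__main__":
    # Made-up (age, zip-code digit) pairs, purely for demonstration.
    data = [(25, 1), (27, 1), (26, 2), (60, 9), (62, 8), (61, 9), (40, 5)]
    for g in greedy_k_groups(data, k=3):
        print(generalize(g), "covers", len(g), "tuples")
```

Every tuple in a group is published with the same generalized ranges, so each tuple is indistinguishable from at least k-1 others on its quasi-identifiers. Tighter groups yield narrower ranges, which is why a clustering criterion is a natural way to reduce information loss.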