k-Anonymity via Clustering Domain Knowledge for Privacy Preservation

Taiyong Li, Changjie Tang, Jiang Wu, Qian Luo, Shengzhi Li, Xun Lin, J. Zuo
DOI: https://doi.org/10.1109/FSKD.2008.428
2008-10-18
Abstract: Preserving privacy in micro-data release is a challenging task in data mining. The k-anonymity method has attracted much attention from researchers, and the quasi-identifier is a key concept in it. Tuples whose quasi-identifiers have similar effects on the sensitive attributes should be grouped together to reduce information loss, a point that previous investigations ignored. This paper studies k-anonymity via clustering domain knowledge. The contributions include: (a) constructing a weighted matrix based on domain knowledge and proposing measure methods that carefully consider the effect of the quasi-identifiers on the sensitive attributes; (b) developing a heuristic algorithm that achieves k-anonymity via clustering domain knowledge based on the measure methods; (c) implementing the algorithm for privacy preservation; and (d) conducting experiments on real data, which demonstrate that the proposed k-anonymous methods reduce information loss by 30% compared with basic k-anonymity.
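For readers unfamiliar with the terms, the following is a minimal illustrative sketch of k-anonymity achieved by grouping tuples under a weighted distance. It is not the paper's algorithm: the abstract does not give the weighted matrix or the measure methods, so the per-attribute weights, the function names, the greedy grouping strategy, and the sample rows are all assumptions made for illustration only.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(c >= k for c in counts.values())


def weighted_distance(a, b, weights):
    """Sum of the (hypothetical) weights of quasi-identifiers on which a and b differ."""
    return sum(w for q, w in weights.items() if a[q] != b[q])


def greedy_groups(rows, weights, k):
    """Greedily form groups of k rows that are close under weighted_distance.

    Assumption: a simple seed-and-nearest-neighbours heuristic, not the
    heuristic from the paper. Leftover rows join the last group.
    """
    remaining = list(rows)
    groups = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        remaining.sort(key=lambda r: weighted_distance(seed, r, weights))
        groups.append([seed] + remaining[:k - 1])
        remaining = remaining[k - 1:]
    if remaining:
        if groups:
            groups[-1].extend(remaining)
        else:
            groups.append(remaining)  # fewer than k rows in total
    return groups


def generalize(groups, quasi_identifiers):
    """Suppress (replace with '*') each quasi-identifier that varies within a group."""
    released = []
    for group in groups:
        for row in group:
            new_row = dict(row)
            for q in quasi_identifiers:
                if len({r[q] for r in group}) > 1:
                    new_row[q] = "*"
            released.append(new_row)
    return released


# Hypothetical sample data and weights (not from the paper).
rows = [
    {"age": "30-39", "zip": "610", "disease": "flu"},
    {"age": "30-39", "zip": "611", "disease": "flu"},
    {"age": "40-49", "zip": "610", "disease": "cancer"},
    {"age": "40-49", "zip": "611", "disease": "cancer"},
]
weights = {"age": 0.8, "zip": 0.2}  # assumed: age matters more for the sensitive attribute

released = generalize(greedy_groups(rows, weights, k=2), ["age", "zip"])
print(is_k_anonymous(rows, ["age", "zip"], 2))      # False
print(is_k_anonymous(released, ["age", "zip"], 2))  # True
```

In this toy example the heavier weight on `age` keeps tuples with the same age band together, so only `zip` has to be suppressed. This mirrors the abstract's idea that grouping tuples whose quasi-identifiers affect the sensitive attribute similarly can lower information loss.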