Cross-Bucket Generalization for Information and Privacy Preservation.

Boyu Li, Yanheng Liu, Xu Han, Jindong Zhang
DOI: https://doi.org/10.1109/tkde.2017.2773069
IF: 9.235
2017-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Generalization is an effective technique for protecting the confidential information of individuals, and numerous generalization algorithms have been proposed. However, previous work does not separate protection against identity disclosure from protection against sensitive attribute disclosure. Thus, when the requirement for attribute protection is higher than that for identity protection, generalization for l-diversity overprotects identity and causes a large loss of information utility. This paper presents a novel approach, called cross-bucket generalization, to address this problem. The rationale is to divide microdata into equivalence groups and buckets. First, the approach provides separate protection for identity and sensitive values, and the level of each protection can be adjusted flexibly based on actual demands. Second, the sizes of equivalence groups and buckets are kept as small as the protection requirements allow, which avoids overprotecting identity and reduces information loss. The experiments we conducted illustrate the effectiveness of our solution.
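The idea of protecting identity and sensitive values at separate levels can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes that identity protection means every equivalence group holds at least k records, and that attribute protection means every bucket contains at least l distinct sensitive values (the simplest form of l-diversity). The function names, the field names `group`, `bucket`, and `disease`, and the sample records are all hypothetical.

```python
from collections import defaultdict


def groups_meet_k(records, group_key, k):
    """Return True if every equivalence group holds at least k records."""
    counts = defaultdict(int)
    for record in records:
        counts[record[group_key]] += 1
    return all(count >= k for count in counts.values())


def buckets_meet_l(records, bucket_key, sensitive_key, l):
    """Return True if every bucket has at least l distinct sensitive values."""
    values = defaultdict(set)
    for record in records:
        values[record[bucket_key]].add(record[sensitive_key])
    return all(len(distinct) >= l for distinct in values.values())


# Hypothetical microdata: two equivalence groups of size 2 share one bucket.
records = [
    {"group": "g1", "bucket": "b1", "disease": "flu"},
    {"group": "g1", "bucket": "b1", "disease": "asthma"},
    {"group": "g2", "bucket": "b1", "disease": "diabetes"},
    {"group": "g2", "bucket": "b1", "disease": "flu"},
]

identity_ok = groups_meet_k(records, "group", k=2)
attribute_ok = buckets_meet_l(records, "bucket", "disease", l=3)
```

In this toy data the attribute requirement (l = 3) is stricter than the identity requirement (k = 2). Because the two checks apply to different partitions, the equivalence groups can stay at size 2 while the larger bucket supplies the diversity, which is the kind of decoupling the abstract describes.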