Privacy-preserving data publishing: an information-driven distributed genetic algorithm

Yong-Feng Ge, Hua Wang, Jinli Cao, Yanchun Zhang, Xiaohong Jiang
DOI: https://doi.org/10.1007/s11280-024-01241-y
2024-01-17
World Wide Web
Abstract: The privacy-preserving data publishing (PPDP) problem has gained substantial attention from research communities, industries, and governments due to the increasing requirements for data publishing and concerns about data privacy. However, achieving a balance between preserving privacy and maintaining data quality remains a challenging task in PPDP. This paper presents an information-driven distributed genetic algorithm (ID-DGA) that aims to achieve optimal anonymization through attribute generalization and record suppression. The proposed algorithm incorporates various components, including an information-driven crossover operator, an information-driven mutation operator, an information-driven improvement operator, and a two-dimensional selection operator. Furthermore, a distributed population model is utilized to improve population diversity while reducing the running time. Experimental results confirm the superiority of ID-DGA in terms of solution accuracy, convergence speed, and the effectiveness of all the proposed components.
computer science, information systems, software engineering
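
The abstract names attribute generalization and record suppression as the two anonymization mechanisms. The sketch below is a minimal illustration of how those two steps can fit together. It is not the paper's method. It assumes a k-anonymity-style privacy rule (the abstract does not state which privacy model is used), and the hierarchies, the function names (generalize_age, generalize_zip, anonymize) and the sample records are all hypothetical. A search procedure such as a genetic algorithm could, under these assumptions, explore different generalization levels per attribute.

```python
from collections import Counter


def generalize_age(age, level):
    """Hypothetical hierarchy: 0 = exact age, 1 = ten-year band, 2+ = fully masked."""
    if level == 0:
        return str(age)
    if level == 1:
        low = (age // 10) * 10
        return f"{low}-{low + 9}"
    return "*"


def generalize_zip(zip_code, level):
    """Hypothetical hierarchy: mask the last `level` characters of the code."""
    level = min(level, len(zip_code))
    return zip_code[: len(zip_code) - level] + "*" * level


def anonymize(records, age_level, zip_level, k):
    """Generalize every record, then suppress records in groups smaller than k.

    Assumption: a record is publishable only if at least k records share
    the same generalized (age, zip) pair.
    """
    generalized = [
        (generalize_age(age, age_level), generalize_zip(zip_code, zip_level))
        for age, zip_code in records
    ]
    group_sizes = Counter(generalized)
    kept = [g for g in generalized if group_sizes[g] >= k]
    suppressed = len(generalized) - len(kept)
    return kept, suppressed


# Made-up sample data, for illustration only.
records = [(23, "3056"), (27, "3058"), (31, "3071"), (36, "3079"), (64, "3550")]

kept, suppressed = anonymize(records, age_level=1, zip_level=1, k=2)
print(kept)
print("suppressed:", suppressed)
```

With no generalization (both levels set to 0) every sample record is unique and all of them are suppressed, while coarser levels keep more records at the cost of less precise values. That is the privacy versus data quality trade-off the abstract describes.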