Data Privacy Quantification and De-identification Model Based on Information Theory

Zeyu Zhang, Zhiyang Lu, Youliang Tian
DOI: https://doi.org/10.1109/nana.2019.00046
2019-01-01
Abstract: De-identification protects the privacy of data against different attacks. Many existing models, such as the K-anonymity model and the differential privacy model, specify de-identification standards and privacy quantification methods. However, the K-anonymity model does not provide an effective method to prove its degree of privacy protection, and when the model parameters change, the degree of privacy protection cannot be quantified. The differential privacy model quantifies the degree of privacy protection on a mathematical basis, but its rigorous calculation method makes it difficult to use widely in organizations and institutions. This paper therefore proposes a de-identification model that includes a quantification scheme for identifying the sensitivity of personal information, a de-identification approach that adaptively matches standard de-identification methods to personal information of different sensitivities, and a function for detecting the de-identification effect. The objective is to provide an automatic, efficient, and widely applicable de-identification model, built by quantitatively analyzing the degree of privacy protection based on conditional entropy. Finally, performance analysis results show that the model achieves a trade-off between the privacy and security of data and the widespread use of data.
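The abstract does not state the exact formula the model uses, so the following is only a minimal sketch of one common way to read "degree of privacy protection based on conditional entropy": estimating H(identity | quasi-identifiers) from empirical frequencies, before and after generalization. The function names, the sample records, and the generalization rule (ZIP prefix, age decade) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def conditional_entropy(records, target, given):
    """Estimate H(target | given) in bits from empirical frequencies.

    Assumption: a higher value means an observer who sees only the `given`
    attributes is more uncertain about `target`.
    """
    n = len(records)
    joint = Counter((tuple(r[g] for g in given), r[target]) for r in records)
    marginal = Counter(tuple(r[g] for g in given) for r in records)
    h = 0.0
    for (key, _), count in joint.items():
        # p(key, target) * log2 p(target | key)
        h -= (count / n) * math.log2(count / marginal[key])
    return h


def generalize(record):
    """Hypothetical generalization: keep a 3-digit ZIP prefix and an age decade."""
    decade = record["age"] // 10 * 10
    return {
        "name": record["name"],
        "zip": record["zip"][:3] + "**",
        "age": f"{decade}-{decade + 9}",
    }


# Illustrative records only; not data from the paper.
raw = [
    {"name": "A", "zip": "55001", "age": 23},
    {"name": "B", "zip": "55002", "age": 27},
    {"name": "C", "zip": "55101", "age": 41},
    {"name": "D", "zip": "55102", "age": 45},
]
generalized = [generalize(r) for r in raw]

h_raw = conditional_entropy(raw, "name", ["zip", "age"])
h_gen = conditional_entropy(generalized, "name", ["zip", "age"])
print(h_raw, h_gen)
```

In this toy data every raw record has a unique (zip, age) pair, so the estimated conditional entropy is 0 bits. After generalization each pair is shared by two people, which raises it to 1 bit. Under the stated assumption, that difference is the kind of quantity a detection step could compare against a target threshold.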