Differential Privacy Overview and Fundamental Techniques

Ferdinando Fioretto, Pascal Van Hentenryck, Juba Ziani
2024-11-07
Abstract: This chapter is meant to be part of the book "Differential Privacy in Artificial Intelligence: From Theory to Practice" and provides an introduction to Differential Privacy. It starts by illustrating various attempts to protect data privacy, emphasizing where and why they failed, and providing the key desiderata of a robust privacy definition. It then defines the key actors, tasks, and scopes that make up the domain of privacy-preserving data analysis. Following that, it formalizes the definition of Differential Privacy and its inherent properties, including composition, post-processing immunity, and group privacy. The chapter also reviews the basic techniques and mechanisms commonly used to implement Differential Privacy in its pure and approximate forms.
Cryptography and Security, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the challenges of protecting data privacy, in particular the shortcomings of traditional privacy protection methods such as data anonymization and k-anonymization. Specifically:

1. **Limitations of Traditional Privacy Protection Methods**:
   - **Data Anonymization**: Even after explicit personal identifiers are removed, individuals can still be re-identified by linking the data with external public data. For example, Latanya Sweeney re-identified the medical records of the then-governor of Massachusetts in anonymized medical data by cross-referencing voter records.
   - **k-Anonymization**: Although it prevents the direct identification of specific individuals, sensitive information can still leak, especially when multiple datasets are combined. In addition, k-anonymization lacks group privacy and composability, so its protection can fail under multiple queries or data aggregations.
2. **Key Requirements for the Definition of Privacy**:
   - **Composability**: When a privacy protection mechanism is applied multiple times, its protection should degrade gradually rather than fail completely.
   - **Post-processing Immunity**: Once data has passed through the privacy protection mechanism, subsequent analysis of the output should not weaken its privacy protection.
   - **Group Privacy**: Protection should not be limited to a single individual but should also extend to groups of multiple individuals.
   - **Quantifiable Privacy-Accuracy Trade-off**: The definition should make the trade-off between privacy and data accuracy explicit, so that data analysts and decision-makers can balance the two according to their needs.
3. **The Proposal of Differential Privacy**:
   - **Differential Privacy (DP)**: A privacy protection framework built on a formal mathematical definition of privacy, which yields precise and provable privacy guarantees.
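For reference, the requirements listed above correspond to standard formal statements from the differential privacy literature. The sketch below uses notation that is assumed here rather than taken from this summary: $\mathcal{M}$ is a randomized mechanism, $D$ and $D'$ are neighboring datasets (differing in one individual's record), $S$ is any set of outputs, and $f$ is any function that does not access the original data. The `\text` command assumes the `amsmath` package.

```latex
% (epsilon, delta)-differential privacy; pure DP is the case delta = 0.
\[
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta
\]

% Composability (basic sequential composition, pure DP): privacy losses add up.
\[
\mathcal{M}_1 \text{ is } \varepsilon_1\text{-DP},\;
\mathcal{M}_2 \text{ is } \varepsilon_2\text{-DP}
\;\Longrightarrow\;
(\mathcal{M}_1, \mathcal{M}_2) \text{ is } (\varepsilon_1 + \varepsilon_2)\text{-DP}
\]

% Post-processing immunity: data-independent processing cannot weaken the guarantee.
\[
\mathcal{M} \text{ is } \varepsilon\text{-DP}
\;\Longrightarrow\;
f \circ \mathcal{M} \text{ is } \varepsilon\text{-DP}
\]

% Group privacy (pure DP): for datasets D and D'' differing in k records,
% the guarantee degrades linearly in k.
\[
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{k\varepsilon}\,\Pr[\mathcal{M}(D'') \in S]
\]
```

The parameter $\varepsilon$ is what makes the privacy-accuracy trade-off quantifiable: smaller values give stronger privacy but typically require more noise.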
The core idea of differential privacy is to introduce calibrated random noise during the data release process, protecting the privacy of individuals while approximately preserving the statistical characteristics of the data.

In summary, this paper introduces the concept of differential privacy and its basic techniques, explains why traditional privacy protection methods fail, and explores how differential privacy overcomes these limitations to provide a more robust solution for modern data privacy protection.
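The noise-addition idea described above can be illustrated with the Laplace mechanism, one of the basic techniques commonly used for pure differential privacy. The following is a minimal sketch under stated assumptions, not the chapter's own implementation: the function names `laplace_noise` and `private_count`, and the toy `ages` data, are hypothetical and chosen only for illustration. It uses the Python standard library and is not hardened for production use (for example, it ignores floating-point subtleties).

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a noisy count of the records that satisfy `predicate`.

    A counting query has sensitivity 1 (adding or removing one record changes
    the count by at most 1), so Laplace noise with scale 1/epsilon is added.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    ages = [23, 35, 41, 29, 62, 57, 38, 44]  # hypothetical toy data
    for eps in (0.1, 1.0, 10.0):
        noisy = private_count(ages, lambda a: a >= 40, eps, rng)
        print(f"epsilon={eps}: noisy count = {noisy:.2f}")
```

Smaller values of `epsilon` produce noisier answers and stronger privacy, while larger values stay closer to the true count; this is the quantifiable privacy-accuracy trade-off in action.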