Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy

Maryam Aliakbarpour, Syomantak Chaudhuri, Thomas A. Courtade, Alireza Fallah, Michael I. Jordan
2024-10-24
Abstract: Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties. However, LDP applies uniform protection to all data features, including less sensitive ones, which degrades performance of downstream tasks. To overcome this limitation, we propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification. This more nuanced approach complements LDP by adjusting privacy protection according to the sensitivity of each feature, enabling improved performance of downstream tasks without compromising privacy. We characterize the properties of BCDP and articulate its connections with standard non-Bayesian privacy frameworks. We further apply our BCDP framework to the problems of private mean estimation and ordinary least-squares regression. The BCDP-based approach obtains improved accuracy compared to a purely LDP-based approach, without compromising on privacy.
Machine Learning, Cryptography and Security
What problem does this paper attempt to address?
This paper addresses how to improve the performance of machine learning tasks while protecting data privacy. The existing local differential privacy (LDP) framework provides strong privacy protection, but it applies a uniform protection level to all data features, including less sensitive ones, which degrades the performance of downstream tasks. To overcome this limitation, the authors propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that quantifies privacy for individual features. By adjusting the protection level of each feature, BCDP can improve downstream performance without sacrificing privacy.

### Main Contributions

1. **Proposing the BCDP Framework**: Starting from a Bayesian perspective, the authors propose a new privacy framework, Bayesian Coordinate Differential Privacy (BCDP). BCDP complements LDP and allows privacy requirements to be tailored to each feature. It ensures that the odds of inferring a specific feature (or coordinate) remain nearly unchanged before and after observing the output of the private mechanism, which allows different features to receive different levels of protection.
2. **Formalizing Relationships and Properties**: The authors establish formal relationships between BCDP and related differential privacy notions, and study standard differential privacy properties such as post-processing and composition.
3. **Application Examples**: To demonstrate the framework, the authors study two fundamental machine learning problems: multivariate mean estimation and ordinary least-squares regression. In both problems, their algorithms satisfy the BCDP constraints and outperform standard algorithms that rely on LDP alone.
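In symbols, the odds-based guarantee described in the first contribution can be sketched as follows. The notation is assumed here for illustration and may differ from the paper's: $X = (X_1, \dots, X_d)$ is the user's data drawn from a prior, $M$ is the private mechanism, $S$ is a set of values for coordinate $i$, $R$ is a set of outputs, and $\delta_i$ is a per-coordinate privacy parameter. All conditioning events are assumed to have positive probability.

$$
\frac{\Pr[X_i \in S \mid M(X) \in R] \,/\, \Pr[X_i \notin S \mid M(X) \in R]}
     {\Pr[X_i \in S] \,/\, \Pr[X_i \notin S]}
= \frac{\Pr[M(X) \in R \mid X_i \in S]}{\Pr[M(X) \in R \mid X_i \notin S]}
\le e^{\delta_i}
$$

The left-hand side is the ratio of posterior odds to prior odds, and the equality follows from Bayes' rule. Applying the same bound to the complement of $S$ gives the matching lower bound $e^{-\delta_i}$, so a small $\delta_i$ keeps the adversary's odds about coordinate $i$ close to their prior value, while a less sensitive coordinate can be assigned a larger $\delta_i$.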
### Background and Motivation

As machine learning is applied in more fields, the demand for data collection has grown sharply and user privacy concerns have become more prominent. Differential privacy (DP) is a widely used framework for protecting individuals' privacy while still making use of their data. Local differential privacy (LDP) suits distributed settings, because users do not need to trust any central authority. However, LDP applies a uniform protection level to all data features, which is particularly costly for high-dimensional data whose features differ in sensitivity. This motivates adjusting the level of privacy protection to the sensitivity of each feature.

### Technical Details

1. **Privacy Protection from the Bayesian Perspective**: In the Bayesian framework, privacy is measured by how much an adversary's odds of guessing an event change after observing the output of the private mechanism. BCDP achieves feature-specific protection by limiting the adversary's ability to infer each specific feature from that output.
2. **Multivariate Mean Estimation**: The authors propose a new algorithm that improves the accuracy of multivariate mean estimation while satisfying the BCDP constraints. Experimental results show that this algorithm outperforms the standard algorithm that uses only LDP.
3. **Ordinary Least-Squares Regression**: The authors also propose an algorithm for ordinary least-squares regression under both LDP and BCDP constraints. The algorithm queries LDP mechanisms in parallel with suitably chosen privacy parameters and then aggregates their outputs to improve the accuracy of the estimate.
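The accuracy cost of uniform protection in mean estimation can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a plain coordinate-wise Laplace mechanism, and the function names, the per-coordinate budgets `eps`, and the assumption that each feature lies in [0, 1] are all hypothetical choices made for illustration. Independent per-coordinate noise also does not by itself establish a BCDP guarantee, since correlated features can leak information about one another.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(x, eps, rng):
    # Hypothetical local randomizer: coordinate i (assumed to lie in [0, 1])
    # receives Laplace noise of scale 1 / eps[i].
    return [xi + laplace_noise(1.0 / e, rng) for xi, e in zip(x, eps)]


def estimate_mean(reports):
    # The server averages the noisy reports coordinate by coordinate.
    n = len(reports)
    return [sum(col) / n for col in zip(*reports)]


def expected_sq_error(eps, n):
    # Laplace(b) has variance 2 * b**2, so the noise variance of the
    # averaged coordinate i is 2 / (n * eps[i]**2). Sampling error of the
    # data itself is ignored here.
    return sum(2.0 / (n * e * e) for e in eps)


# Assumed budgets: feature 0 is sensitive, features 1 and 2 are less so.
uniform_eps = [0.5, 0.5, 0.5]    # strictest budget applied to every feature
specific_eps = [0.5, 2.0, 2.0]   # relaxed budget for less sensitive features

rng = random.Random(0)
true_x = [0.2, 0.5, 0.8]         # every simulated user holds the same vector
reports = [privatize(true_x, specific_eps, rng) for _ in range(2000)]
estimate = estimate_mean(reports)

print("estimate:", estimate)
print("noise error, uniform :", expected_sq_error(uniform_eps, 2000))
print("noise error, specific:", expected_sq_error(specific_eps, 2000))
```

Under these assumptions, the sensitive feature receives the same noise in both configurations, and the entire reduction in expected noise error comes from the less sensitive features, whose noise variance falls by a factor of 16 when their budget grows from 0.5 to 2.0.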
### Conclusion

By introducing the BCDP framework, the authors address the problem of improving the performance of machine learning tasks while protecting privacy. BCDP tailors privacy protection to the sensitivity of each data feature and, in the mean estimation and least-squares regression problems studied, improves accuracy over a purely LDP-based approach without sacrificing privacy. The framework also suggests a direction for future research, especially in privacy-preserving processing of high-dimensional data.