Abstract: ML models are ubiquitous in real world applications and are a constant focus of research. At the same time, the community has started to realize the importance of protecting the privacy of ML training data. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP to real world complex ML models are still few and far between. The adoption of DP is hindered by limited practical guidance of what DP protection entails, what privacy guarantees to aim for, and the difficulty of achieving good privacy-utility-computation trade-offs for ML models. Tricks for tuning and maximizing performance are scattered among papers or stored in the heads of practitioners. Furthermore, the literature seems to present conflicting evidence on how and whether to apply architectural adjustments and which components are "safe" to use with DP. This work is a self-contained guide that gives an in-depth overview of the field of DP ML and presents information about achieving the best possible DP ML model with rigorous privacy guarantees. Our target audience is both researchers and practitioners. Researchers interested in DP for ML will benefit from a clear overview of current advances and areas for improvement. We include theory-focused sections that highlight important topics such as privacy accounting and its assumptions, and convergence. For a practitioner, we provide a background in DP theory and a clear step-by-step guide for choosing an appropriate privacy definition and approach, implementing DP training, potentially updating the model architecture, and tuning hyperparameters. For both researchers and practitioners, consistently and fully reporting privacy guarantees is critical, and so we propose a set of specific best practices for stating guarantees.

What problem does this paper attempt to address?

The problem this paper addresses is how to train machine learning (ML) models with differential privacy (DP). Specifically, it aims to give researchers and practitioners a comprehensive guide for applying DP to complex ML models, such as deep neural networks, so that the privacy of the training data is protected while the models remain accurate and practical to use.

### The main problems of the paper can be summarized as follows:

1. **Trade-off between privacy and utility**:
   - Modern ML models are becoming increasingly complex, and protecting the privacy of their training data has become crucial. When DP is applied to these models, however, privacy protection and model utility often pull in opposite directions.
   - The central question is how to minimize the loss in model performance while still guaranteeing privacy.
2. **Lack of practical guidance**:
   - Although DP has been widely studied, there is still little systematic guidance on how to choose an appropriate privacy definition, adjust the model architecture, and tune hyperparameters in practice.
   - The paper aims to fill this gap with a self-contained guide that helps researchers and practitioners understand and apply DP.
3. **Specific implementation of privacy protection**:
   - Practical questions include how to choose an appropriate privacy budget (the ε value), how to calculate and report privacy guarantees, and how to handle specific scenarios such as user-level privacy protection.
   - The paper discusses these questions in detail and offers specific suggestions and methods.
4. **Hyperparameter optimization and model architecture design**:
   - The choice of hyperparameters and the design of the model architecture strongly affect the performance of DP ML models.
   - The paper explores how to tune hyperparameters effectively within the DP framework and proposes ways to adapt model architectures.
5. **Combination of theory and practice**:
   - Beyond the practical guidance, the paper covers important theoretical foundations, such as privacy accounting and the convergence of the DP-SGD algorithm, to help readers understand in more depth how DP works.

### Summary

The core problem of the paper is how to apply differential privacy effectively to complex ML models so that data privacy is protected while model performance and practicality are preserved. By providing detailed guidance and theoretical support, the paper aims to promote the deployment and use of DP in more real-world applications.
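
To make the DP-SGD algorithm mentioned in the summary more concrete, the following is a minimal NumPy sketch of a single DP-SGD update in the usual formulation: clip each example's gradient, sum, add Gaussian noise, then average. It is an illustration under stated assumptions, not the paper's implementation. The least-squares model, the function name `dp_sgd_step`, and the parameter names `clip_norm` and `noise_multiplier` are all hypothetical choices made for this example. The sketch performs no privacy accounting, so it does not compute the resulting ε.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update for least-squares linear regression (illustrative only)."""
    # Per-example gradients of 0.5 * (x . w - y)^2, shape (batch, dim).
    residuals = X @ w - y
    per_example_grads = residuals[:, None] * X

    # Clip each example's gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # Sum the clipped gradients, add Gaussian noise with standard deviation
    # noise_multiplier * clip_norm, then average over the batch.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / X.shape[0]

    return w - lr * noisy_mean
```

Clipping bounds how much any single example can change the summed gradient, and the noise scale is tied to that bound. A larger `noise_multiplier` generally gives a stronger privacy guarantee at the cost of noisier updates, which is the privacy-utility trade-off described above.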

How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy

DPMLBench: Holistic Evaluation of Differentially Private Machine Learning

Deep Learning with Differential Privacy

Not one but many Tradeoffs: Privacy Vs. Utility in Differentially Private Machine Learning

Privacy at a Price: Exploring its Dual Impact on AI Fairness

Evaluating Differentially Private Machine Learning in Practice

When Deep Learning Meets Differential Privacy: Privacy, Security, and More

A Critical Review on the Use (and Misuse) of Differential Privacy in Machine Learning

DPAdapter: Improving Differentially Private Deep Learning through Noise Tolerance Pre-training

Differential Privacy in Privacy-Preserving Big Data and Learning: Challenge and Opportunity

Differentially Private Natural Language Models: Recent Advances and Future Directions

Advancing Differential Privacy: Where We Are Now and Future Directions for Real-World Deployment

Optimal Differentially Private Model Training with Public Data

Scalable Differential Privacy Mechanisms for Real-Time Machine Learning Applications

Discriminative Adversarial Privacy: Balancing Accuracy and Membership Privacy in Neural Networks

Differential Privacy Made Easy

Belt and Brace: When Federated Learning Meets Differential Privacy

A Survey on Differential Privacy with Machine Learning and Future Outlook

Advances in Differential Privacy and Differentially Private Machine Learning

Differential Privacy for Deep and Federated Learning: A Survey

Medical imaging deep learning with differential privacy