Abstract: Adversarial robustness, the ability of a model to withstand manipulated inputs that cause errors, is essential for ensuring the trustworthiness of machine learning models in real-world applications. However, previous studies have shown that enhancing adversarial robustness through adversarial training increases vulnerability to privacy attacks. While differential privacy can mitigate these attacks, it often compromises robustness against both natural and adversarial samples. Our analysis reveals that differential privacy disproportionately impacts low-risk samples, causing an unintended performance drop. To address this, we propose DeMem, which selectively targets high-risk samples, achieving a better balance between privacy protection and model robustness. DeMem is versatile and can be seamlessly integrated into various adversarial training techniques. Extensive evaluations across multiple training methods and datasets demonstrate that DeMem significantly reduces privacy leakage while maintaining robustness against both natural and adversarial samples. These results confirm DeMem's effectiveness and broad applicability in enhancing privacy without compromising robustness.

What problem does this paper attempt to address?

The problem this paper attempts to solve is how to enhance the adversarial robustness of machine-learning models without increasing the risk of privacy leakage. Specifically, the paper focuses on how to balance model robustness and privacy protection when Differential Privacy (DP) is applied together with adversarial training.

### Problem Background

1. **Adversarial Robustness**: To keep machine-learning models reliable in the face of adversarial samples, researchers have proposed methods such as adversarial training to improve robustness. However, these methods may make the model more vulnerable to privacy attacks, especially Membership Inference Attacks (MIA).
2. **Privacy Protection**: Differential privacy is an effective privacy-protection technique, but it usually reduces the robustness and overall performance of the model on both natural and adversarial samples. Research shows that DP has a particularly significant impact on low-risk samples, leading to a decline in model performance.

### Core Problem of the Paper

The paper's analysis finds that differential privacy has a greater impact on low-risk samples (i.e., those that have a smaller impact on the model output), and that these samples are crucial for maintaining the overall performance of the model. Directly applying differential privacy therefore leads to a significant decline in model performance. To solve this problem, the paper proposes a new method, DeMem (Dememorization), which aims to enhance privacy protection by selectively targeting high-risk samples (i.e., those that have a greater impact on the model output) while minimizing the impact on model performance.

### Main Contributions

1. **In-depth Analysis**: The paper presents what it describes as the first analysis, based on per-sample memorization scores, of why differential privacy significantly reduces model performance.
2. **Revealing the Problem**: The research finds that differential privacy affects low-risk samples disproportionately, which explains the decline in model robustness.
3. **Proposing a Solution**: The DeMem method is proposed. By selectively restricting the memorization of high-risk samples, it enhances privacy protection while reducing the impact on model performance.

### Conclusion

The paper evaluates DeMem through extensive experiments, showing that it can significantly reduce privacy leakage without sacrificing robustness. This provides new ideas and methods for achieving a better balance between privacy and robustness in practical applications.

### Formula Representation

- **Memorization Score**:
  \[ \text{mem}(A, D_{\text{tr}}, i)=\Pr_{h \leftarrow A(D_{\text{tr}})}[h(x_i) = y_i]-\Pr_{h \leftarrow A(D_{\text{tr}}\setminus \{i\})}[h(x_i) = y_i] \]
- **Sample-wise Dememorization Penalty**:
  \[ \Psi(B)=\frac{1}{N}\sum_{i = 1}^N\left(\ell(x_i,\theta)-\frac{1}{N}\sum_{j = 1}^N\ell(x_j,\theta)\right)^2 \]
- **Total Loss Function**:
  \[ L_{\text{total}}(\theta)=L(\theta)+\lambda\cdot\Psi(B) \]
  where \(L(\theta)\) is the original loss function and \(\lambda\) is a hyper-parameter that controls the intensity of the dememorization penalty.

Through these formulas, the paper provides a specific framework for implementing the DeMem method and shows its effectiveness on multiple datasets and model architectures.
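The three formulas above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation. It assumes that \(B\) is a batch of \(N\) samples, that the per-sample losses \(\ell(x_i,\theta)\) have already been computed (for example on adversarial inputs), and that \(L(\theta)\) is the batch-mean loss; the summary states none of these explicitly. The function names and the empirical estimator for the memorization score (averaging correctness over several training runs with and without sample \(i\)) are hypothetical choices made for illustration.

```python
from statistics import fmean
from typing import Sequence


def dememorization_penalty(losses: Sequence[float]) -> float:
    """Psi(B): mean squared deviation of per-sample losses from the batch mean."""
    mean_loss = fmean(losses)
    return fmean([(loss - mean_loss) ** 2 for loss in losses])


def total_loss(losses: Sequence[float], lam: float) -> float:
    """L_total = L + lambda * Psi(B).

    Assumption: L is taken to be the mean of the per-sample losses.
    """
    return fmean(losses) + lam * dememorization_penalty(losses)


def memorization_score(hits_with: Sequence[int], hits_without: Sequence[int]) -> float:
    """Empirical estimate of mem(A, D_tr, i).

    hits_with[k] is 1 if the k-th model trained WITH sample i classifies it
    correctly (else 0); hits_without is the same for models trained WITHOUT
    sample i. The difference of the two fractions approximates the
    difference of the two probabilities in the definition.
    """
    return fmean(hits_with) - fmean(hits_without)


# One sample with a much higher loss than the others raises the penalty.
batch_losses = [0.5, 0.5, 2.0]
print(dememorization_penalty(batch_losses))   # 0.5
print(total_loss(batch_losses, lam=0.5))      # 1.25

# Sample i is correct in 3 of 4 runs when included, 1 of 4 when left out.
print(memorization_score([1, 1, 1, 0], [0, 1, 0, 0]))  # 0.5
```

Because \(\Psi(B)\) is the variance of the per-sample losses, it is zero when every sample in the batch has the same loss and grows when a few samples stand out from the rest, which is how the penalty acts on individual outlying samples more than on the batch as a whole.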

DeMem: Privacy-Enhanced Robust Adversarial Learning via De-Memorization

D-DAE: Defense-Penetrating Model Extraction Attacks

Adversarial for Good – Defending Training Data Privacy with Adversarial Attack Wisdom

Differentially Private Optimizers Can Learn Adversarially Robust Models

DP-MemArc: Differential Privacy Transfer Learning for Memory Efficient Language Models

On the Privacy Effect of Data Enhancement Via the Lens of Memorization

DPMLBench: Holistic Evaluation of Differentially Private Machine Learning

Differentially-Private Deep Learning from an Optimization Perspective

Differentially Private and Adversarially Robust Machine Learning: An Empirical Evaluation

Adversarial Robust Memory-Based Continual Learner

Differentially Private Deep Learning with Smooth Sensitivity

AMOUE: Adaptive modified optimized unary encoding method for local differential privacy data preservation

Robustness, Privacy, and Generalization of Adversarial Training

Stable Unlearnable Example: Enhancing the Robustness of Unlearnable Examples via Stable Error-Minimizing Noise

Balancing Privacy and Robustness in Prompt Learning for Large Language Models

Discriminative Adversarial Privacy: Balancing Accuracy and Membership Privacy in Neural Networks

Privacy Risks of Securing Machine Learning Models against Adversarial Examples

Not Just Cloud Privacy: Protecting Client Privacy in Teacher-Student Learning

When Deep Learning Meets Differential Privacy: Privacy, Security, and More

An Adaptive and Fast Convergent Approach to Differentially Private Deep Learning

Fine-Tuning Language Models with Differential Privacy through Adaptive Noise Allocation