Abstract: Adversarial training (AT) refers to integrating adversarial examples -- inputs altered with imperceptible perturbations that can significantly impact model predictions -- into the training process. Recent studies have demonstrated the effectiveness of AT in improving the robustness of deep neural networks against diverse adversarial attacks. However, a comprehensive overview of these developments is still missing. This survey addresses this gap by reviewing a broad range of recent and representative studies. Specifically, we first describe the implementation procedures and practical applications of AT, followed by a comprehensive review of AT techniques from three perspectives: data enhancement, network design, and training configurations. Lastly, we discuss common challenges in AT and propose several promising directions for future research.

What problem does this paper attempt to address?

The paper addresses the following problem: **reviewing the role and methods of adversarial training (AT) in improving the robustness of deep neural networks against a range of adversarial attacks**. Specifically, it aims to fill a gap in the literature, which lacks a comprehensive overview of adversarial training techniques, by reviewing a broad range of recent and representative studies.

### Specific description of the problem

1. **Impact of adversarial samples**: Adversarial samples are generated by applying small, imperceptible perturbations to the input data, and these perturbations can significantly change the model's predictions. For example, a small perturbation to an image may cause a classification model to assign it to a completely different category (as shown in Figure 1).
2. **Effectiveness of adversarial training**: Recent research has shown that adversarial training can effectively improve the robustness of deep neural networks against multiple adversarial attacks. However, a comprehensive review of these developments is still missing, which makes it difficult for researchers to understand and apply these techniques systematically.
3. **Deficiencies in existing literature**: Although adversarial training has been applied in multiple fields (such as medical image segmentation, autonomous driving, and anomaly detection), the existing literature does not provide a comprehensive framework for summarizing and classifying these techniques.

### Goals of the paper

To remedy this deficiency, the paper reviews adversarial training along the following lines:

- **Implementation process and practical applications**: Describe the implementation steps of adversarial training and its applications in practice.
- **Technical classification**: Classify and review adversarial training techniques in detail from three perspectives (data enhancement, network design, and training configurations).
- **Challenges and future directions**: Discuss common challenges in adversarial training and propose promising directions for future research.

### Formula representation

Adversarial training is usually formulated as a min-max optimization problem:

\[ \min_{\theta} \mathbb{E}_{(x,y) \sim D} \left[ \max_{\delta \in B(x,\epsilon)} \ell(x+\delta, y; \theta) \right] \]

where:

- $\theta$ represents the model parameters,
- $(x, y)$ represents training data sampled from the data distribution $D$,
- $\ell(x+\delta, y; \theta)$ represents the loss computed on the adversarial sample $x + \delta$ and its true label $y$,
- $\delta$ represents the adversarial perturbation, a small change that is imperceptible to humans but can significantly degrade model performance,
- $B(x, \epsilon)$ is the set of allowed perturbations, defined as:

\[ B(x, \epsilon) = \{\delta \mid x+\delta \in [0,1], \|\delta\|_p \leq \epsilon\} \]

where $\epsilon$ is the maximum perturbation magnitude, $\|\delta\|_p$ measures the perturbation size using the $p$-norm, and all pixels are normalized to the range $[0,1]$.

### Conclusion

By systematically summarizing and classifying adversarial training techniques, the paper provides a comprehensive reference framework for researchers and points out promising directions for future research. This supports the development and application of adversarial training techniques, especially for improving the robustness and security of deep learning models.
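As a concrete illustration of the min-max formulation of adversarial training, the following is a minimal sketch in pure Python. It rests on several assumptions that are not taken from the paper: the model is a simple logistic regression, the norm is $p = \infty$, and the inner maximization is approximated with projected gradient ascent on sign gradients (PGD), one common choice. The function names (`pgd_linf`, `adversarial_train`), step sizes, and toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def logit(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_loss(w, b, x, y):
    # Stable form of log(1 + exp(-s)), with s = +z for y = 1 and s = -z for y = 0.
    s = (2 * y - 1) * logit(w, b, x)
    return max(-s, 0.0) + math.log1p(math.exp(-abs(s)))


def pgd_linf(w, b, x, y, eps, step, n_steps):
    """Inner maximization: approximate the worst-case delta in B(x, eps), p = inf."""
    delta = [0.0] * len(x)
    for _ in range(n_steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        p = sigmoid(logit(w, b, x_adv))
        grad = [(p - y) * wi for wi in w]  # d loss / d input for a linear model
        for i, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            d = delta[i] + step * sign                 # ascent step on the loss
            d = max(-eps, min(eps, d))                 # project: |delta_i| <= eps
            d = max(0.0, min(1.0, x[i] + d)) - x[i]    # project: x + delta in [0, 1]
            delta[i] = d
    return delta


def adversarial_train(data, eps=0.1, lr=0.5, epochs=20, seed=0):
    """Outer minimization: SGD on the loss evaluated at adversarial inputs."""
    rng = random.Random(seed)
    data = list(data)  # copy so the caller's list is not reordered
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            delta = pgd_linf(w, b, x, y, eps, step=eps / 4, n_steps=8)
            x_adv = [xi + di for xi, di in zip(x, delta)]
            err = sigmoid(logit(w, b, x_adv)) - y      # d loss / d logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x_adv)]
            b -= lr * err
    return w, b


if __name__ == "__main__":
    # Toy data with features already normalized to [0, 1]; labels are 0 or 1.
    toy = [([0.2, 0.3], 0), ([0.1, 0.2], 0), ([0.8, 0.7], 1), ([0.9, 0.8], 1)]
    w, b = adversarial_train(toy)
    print("weights:", w, "bias:", b)
```

The two projection steps inside `pgd_linf` correspond to the two constraints in the definition of $B(x, \epsilon)$: the first enforces $\|\delta\|_\infty \leq \epsilon$ and the second enforces $x + \delta \in [0,1]$. For a deep network, the input gradient would come from automatic differentiation rather than the closed form used here.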

Adversarial Training: A Survey

Recent Advances in Adversarial Training for Adversarial Robustness

A survey of robust adversarial training in pattern recognition: Fundamental, theory, and methodologies

Adversarial Examples based on Object Detection tasks: A Survey

Adversarial Examples on Object Recognition: A Comprehensive Survey

Bag of Tricks for Adversarial Training

Blind Adversarial Training: Towards Comprehensively Robust Models Against Blind Adversarial Attacks

Enhancing Robust Representation in Adversarial Training: Alignment and Exclusion Criteria

Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey

On the Effectiveness of Adversarial Training Against Backdoor Attacks

Adversarial Training via Adaptive Knowledge Amalgamation of an Ensemble of Teachers

Adversarial Attacks and Defences: A Survey

Strength-Adaptive Adversarial Training

Adversarial Attacks of Vision Tasks in the Past 10 Years: A Survey

Adversarial Attack and Defense: A Survey

Enhancing Adversarial Robustness through Stable Adversarial Training

Improving the Generalization of Adversarial Training with Domain Adaptation

Adversarial Distributional Training for Robust Deep Learning

A Survey of Practical Adversarial Example Attacks