Nrat: towards adversarial training with inherent label noise

Zhen Chen,Fu Wang,Ronghui Mu,Peipei Xu,Xiaowei Huang,Wenjie Ruan
DOI: https://doi.org/10.1007/s10994-023-06437-3
IF: 5.414
2024-01-12
Machine Learning
Abstract:Adversarial training (AT) has been widely recognized as the most effective defense approach against adversarial attacks on deep neural networks and it is formulated as a min-max optimization. Most AT algorithms are geared towards research-oriented datasets such as MNIST, CIFAR10, etc., where the labels are generally correct. However, noisy labels, e.g., mislabelling, are inevitable in real-world datasets. In this paper, we investigate AT with inherent label noise, where the training dataset itself contains mislabeled samples. We first empirically show that the performance of AT typically degrades as the label noise rate increases. Then, we propose a Noisy-Robust Adversarial Training (NRAT) algorithm, which leverages the recent advancements in learning with noisy labels to enhance the performance of AT in the presence of label noise. For experimental comparison, we consider two essential metrics in AT: (i) trade-off between natural and robust accuracy; (ii) robust overfitting. Our experiments show that NRAT's performance is on par with, or better than, the state-of-the-art AT methods on both evaluation metrics. Our code is publicly available at: https://github.com/TrustAI/NRAT.
computer science, artificial intelligence
What problem does this paper attempt to address?