Blind Adversarial Training: Towards Comprehensively Robust Models Against Blind Adversarial Attacks.

Haidong Xie, Xueshuang Xiang, Bin Dong, Naijin Liu
DOI: https://doi.org/10.1007/978-981-99-9119-8_2
2024-01-01
Abstract: Adversarial training (AT) aims to improve models' robustness against adversarial attacks by mixing clean data and adversarial examples (AEs) into training. Most existing AT approaches can be grouped into restricted and unrestricted approaches. Restricted AT requires a prescribed uniform budget for AEs during training, with the obtained results showing high sensitivity to the budget. In contrast, unrestricted AT uses unconstrained AEs, and these overestimated AEs significantly lower the clean accuracy and robustness against small budget attacks. Thus, the existing AT approaches find it difficult to obtain a comprehensively robust model when confronting attacks with an unknown budget, which we name blind adversarial attacks. Considering this problem, this paper proposes a novel AT approach named blind adversarial training (BAT). The main idea is to use a cutoff-scale strategy to adaptively estimate a nonuniform budget to modify the AEs used in training, ensuring that the strengths of the AEs are dynamically located in a reasonable range and ultimately improving the comprehensive robustness of the AT model. We include a theoretical investigation on a toy classification problem to guarantee the improvement of BAT. The experimental results also demonstrate that BAT can achieve better comprehensive robustness than AT with several AEs.
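The abstract does not give the exact form of the cutoff-scale strategy, so the following is only an illustrative sketch of how a per-sample (nonuniform) budget could be obtained from unrestricted perturbations. It assumes that each perturbation's L2 norm is first capped at a cutoff value and then multiplied by a scale factor; the function name `cutoff_scale`, the parameters `cutoff` and `scale`, the choice of the L2 norm, and the order of the two steps are all assumptions made for illustration, not the authors' formulation.

```python
import math


def cutoff_scale(perturbations, cutoff, scale):
    """Illustrative cutoff-then-scale rule (hypothetical, not from the paper).

    Each perturbation keeps its direction, but its L2 norm becomes
    scale * min(original_norm, cutoff), giving a per-sample budget.
    """
    adjusted = []
    for delta in perturbations:
        norm = math.sqrt(sum(d * d for d in delta))
        if norm == 0.0:
            # A zero perturbation has no direction to rescale.
            adjusted.append(list(delta))
            continue
        target = scale * min(norm, cutoff)  # cap the strength, then shrink it
        factor = target / norm
        adjusted.append([d * factor for d in delta])
    return adjusted


def l2_norm(vector):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(v * v for v in vector))


# Toy perturbations: one large, one small, one zero.
raw = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
adjusted = cutoff_scale(raw, cutoff=1.0, scale=0.5)
budgets = [l2_norm(d) for d in adjusted]
```

In this toy run the large perturbation is capped by the cutoff while the small one is only scaled, so the resulting budgets differ per sample instead of sharing one prescribed value, which is the contrast with restricted AT that the abstract draws.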