Boosting Adversarial Training via Fisher-Rao Norm-based Regularization

Xiangyu Yin, Wenjie Ruan
2024-03-26
Abstract: Adversarial training is extensively utilized to improve the adversarial robustness of deep neural networks. Yet, mitigating the degradation of standard generalization performance in adversarially trained models remains an open problem. This paper attempts to resolve this issue through the lens of model complexity. First, we leverage the Fisher-Rao norm, a geometrically invariant metric for model complexity, to establish non-trivial bounds on the Cross-Entropy Loss-based Rademacher complexity for a ReLU-activated Multi-Layer Perceptron. Then we generalize a complexity-related variable, which is sensitive to changes in model width and to the trade-off factors in adversarial training. Moreover, extensive empirical evidence validates that this variable correlates highly with the generalization gap of Cross-Entropy loss between adversarially trained and standard-trained models, especially during the initial and final phases of the training process. Building upon this observation, we propose a novel regularization framework, called Logit-Oriented Adversarial Training (LOAT), which can mitigate the trade-off between robustness and accuracy while imposing only a negligible increase in computational overhead. Our extensive experiments demonstrate that the proposed regularization strategy can boost the performance of prevalent adversarial training algorithms, including PGD-AT, TRADES, TRADES (LSE), MART, and DM-AT, across various network architectures. Our code will be available at
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The main problem this paper attempts to solve is: **how to alleviate the decline in standard generalization performance caused by adversarial training**. Specifically, the authors approach the problem from the perspective of model complexity: they use the Fisher-Rao norm, a geometrically invariant measure, to quantify model complexity, and propose a new regularization framework, Logit-Oriented Adversarial Training (LOAT), to improve the trade-off between standard accuracy and robustness in adversarially trained models without significantly increasing the computational cost.

### Specific Problem Description

1. **Decline in Generalization Performance in Adversarial Training**:
   - Although adversarial training improves a model's robustness against adversarial examples, it often reduces the model's generalization performance on standard data. This is the "trade-off between robustness and accuracy".
   - In practice, the accuracy of an adversarially trained model on the clean test set is lower than that of a standard-trained model.
2. **Limitations of Existing Methods**:
   - Previous studies have explained this phenomenon from different perspectives, such as bias introduction, insufficient data volume, and the local Lipschitz property, but most of them focus on a single factor and do not provide a unified theoretical explanation.

### Solutions in the Paper

1. **Introduction of the Fisher-Rao Norm**:
   - The Fisher-Rao norm is a geometrically invariant complexity measure, which the paper applies to ReLU-activated multi-layer perceptron (MLP) models.
   - Using the Fisher-Rao norm, the authors establish upper and lower bounds on the Rademacher complexity of the cross-entropy loss and identify a variable Γce, sensitive to the model width and to the adversarial training trade-off factors, that correlates closely with the generalization gap.
2. **Proposing Logit-Oriented Adversarial Training (LOAT)**:
   - LOAT combines two regularization strategies: standard logit-oriented regularization and an adaptive adversarial logit pairing strategy.
   - The regularization direction is adjusted at the beginning and at the end of training, respectively, which alleviates the trade-off between robustness and accuracy while keeping the added computational cost minimal.

### Main Contributions

- **Theoretical Analysis**: Using the Fisher-Rao norm, the authors derive upper and lower bounds on the Rademacher complexity of the cross-entropy loss and reveal the relationship between the variable Γce and the generalization gap.
- **Experimental Verification**: Extensive experiments show that LOAT can improve the performance of existing adversarial training algorithms (PGD-AT, TRADES, TRADES (LSE), MART, and DM-AT) with a negligible increase in computational cost.

Taken together, these results offer a complexity-based approach to mitigating the decline in generalization performance in adversarial training.
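
To make the Fisher-Rao norm mentioned above more concrete, the sketch below estimates its squared value for a small bias-free ReLU MLP under cross-entropy loss. It is not the paper's code. It relies on an identity from the Fisher-Rao norm literature for bias-free ReLU networks with `L + 1` weight matrices, where the squared norm equals `(L + 1)^2` times the expected squared inner product between the loss gradient with respect to the logits and the logits themselves. The function names (`mlp_logits`, `fisher_rao_norm_sq`), the layer sizes, and the use of observed labels (an empirical variant) are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mlp_logits(x, weights):
    # Forward pass of a bias-free ReLU MLP (hypothetical helper).
    # `weights` is a list of matrices; ReLU follows every layer but the last.
    h = x
    for w in weights[:-1]:
        h = np.maximum(h @ w, 0.0)
    return h @ weights[-1]


def fisher_rao_norm_sq(x, y, weights):
    # Empirical estimate of (L + 1)^2 * E[<dCE/df, f>^2], assuming the
    # bias-free ReLU identity described in the lead-in and observed labels y.
    f = mlp_logits(x, weights)
    grad = softmax(f)
    grad[np.arange(len(y)), y] -= 1.0      # dCE/dlogits = softmax - one_hot
    inner = (grad * f).sum(axis=-1)        # <dCE/df, f> per sample
    n_layers = len(weights)                # this is L + 1
    return n_layers ** 2 * np.mean(inner ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = [rng.normal(size=(4, 5)), rng.normal(size=(5, 3))]
    x = rng.normal(size=(6, 4))
    y = rng.integers(0, 3, size=6)
    print(fisher_rao_norm_sq(x, y, weights))
```

Because the estimate needs only logits and labels, it can be tracked during training at little extra cost, which is in keeping with the logit-oriented focus of the method. How the paper actually turns this quantity into the variable Γce and into the LOAT regularizers is not reproduced here.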