Exploring Frequencies via Feature Mixing and Meta-Learning for Improving Adversarial Transferability

Juanjuan Weng, Zhiming Luo, Shaozi Li
2024-05-06
Abstract: Recent studies have shown that Deep Neural Networks (DNNs) are susceptible to adversarial attacks, with frequency-domain analysis underscoring the significance of high-frequency components in influencing model predictions. Conversely, targeting low-frequency components has been effective in enhancing attack transferability on black-box models. In this study, we introduce a frequency decomposition-based feature mixing method to exploit these frequency characteristics in both clean and adversarial samples. Our findings suggest that incorporating features of clean samples into adversarial features extracted from adversarial examples is more effective in attacking normally-trained models, while combining clean features with the adversarial features extracted from low-frequency parts decomposed from the adversarial samples yields better results in attacking defense models. However, a conflict issue arises when these two mixing approaches are employed simultaneously. To tackle the issue, we propose a cross-frequency meta-optimization approach comprising the meta-train step, meta-test step, and final update. In the meta-train step, we leverage the low-frequency components of adversarial samples to boost the transferability of attacks against defense models. Meanwhile, in the meta-test step, we utilize adversarial samples to stabilize gradients, thereby enhancing the attack's transferability against normally trained models. For the final update, we update the adversarial sample based on the gradients obtained from both meta-train and meta-test steps. Our proposed method is evaluated through extensive experiments on the ImageNet-Compatible dataset, affirming its effectiveness in improving the transferability of attacks on both normally-trained CNNs and defense models.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper aims to improve the transferability of adversarial examples across models, covering both normally-trained CNNs and defense models. Specifically:

1. **Transferability of adversarial examples**: Deep neural networks (DNNs) are vulnerable to adversarial attacks, but adversarial examples crafted on one model transfer only to a limited degree to other models. This limitation matters most in the black-box setting, where transferability is one of the key issues.
2. **Application of frequency-domain analysis**: Frequency-domain analysis shows that high-frequency components strongly influence model predictions, while targeting low-frequency components is more effective for enhancing attack transferability. How to exploit these frequency characteristics to improve the transferability of adversarial examples is therefore an important research direction.
3. **Feature mixing and meta-learning**: To use these frequency characteristics more effectively, the paper proposes a feature mixing method based on frequency decomposition and a cross-frequency meta-optimization method. Together they resolve the conflict that arises when the two feature mixing variants (AFM and LF-AFM) are used simultaneously, and improve transferability against both normally-trained models and defense models.

### Main contributions

- **Introducing a feature mixing method based on frequency decomposition**: The input image is decomposed into low-frequency and high-frequency parts, and the features of these parts are mixed with the features of the adversarial examples to improve their transferability.
- **Proposing a cross-frequency meta-optimization framework**: Meta-train and meta-test steps reconcile the inconsistent effects that the feature mixing variants have on different models, enhancing the overall transferability of the adversarial examples.
- **Experimental verification**: Extensive experiments on the ImageNet-Compatible dataset show that the proposed method achieves higher transferability against both normally-trained models and defense models, with the larger gains on defense models.

### Solutions

- **Feature mixing**:
  - **Low-frequency adversarial feature mixing (LF-AFM)**: Mix the low-frequency and high-frequency features of clean samples with the features extracted from the low-frequency part of the adversarial samples.
  - **Adversarial feature mixing (AFM)**: Mix the low-frequency and high-frequency features of clean samples with the features extracted from the adversarial samples themselves.
- **Cross-frequency meta-optimization**:
  - **Meta-train step**: Use LF-AFM to generate temporary adversarial samples and enhance the transferability of attacks on defense models.
  - **Meta-test step**: Use AFM to stabilize the gradient and enhance the transferability of attacks on normally-trained models.
  - **Final update**: Combine the gradients from the meta-train and meta-test steps to update the adversarial samples.

Together, these methods improve the transferability of adversarial examples across models, particularly when attacking defense models.
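
To make the three-step update above more concrete, here is a minimal NumPy sketch of one cross-frequency iteration. It is an illustration under stated assumptions, not the authors' implementation. `ToyModel` is a hypothetical linear stand-in for a CNN so that gradients can be written by hand. The FFT disk mask with its `radius` parameter, the additive mixing rule with weight `eta`, the sign-based step of size `alpha`, and the L-infinity budget `eps` are all assumed for illustration; the summary does not specify any of them.

```python
import numpy as np

def split_frequencies(image, radius):
    """Split a 2-D image into low- and high-frequency parts (assumed FFT disk mask)."""
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))
    return low, image - low

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class ToyModel:
    """Hypothetical linear surrogate: features = W1 x, logits = W2 features."""
    def __init__(self, size, n_feat, n_cls, rng):
        self.w1 = rng.normal(size=(n_feat, size * size)) / size
        self.w2 = rng.normal(size=(n_cls, n_feat)) / np.sqrt(n_feat)

    def features(self, image):
        return self.w1 @ image.ravel()

    def loss_and_grad(self, image, clean_feat, label, eta):
        """Cross-entropy on mixed features and its gradient w.r.t. `image`."""
        mixed = self.features(image) + eta * clean_feat  # assumed additive mixing
        probs = softmax(self.w2 @ mixed)
        loss = -np.log(probs[label])
        d_logits = probs.copy()
        d_logits[label] -= 1.0
        grad = (self.w1.T @ (self.w2.T @ d_logits)).reshape(image.shape)
        return loss, grad

def cross_frequency_step(model, x_adv, x_clean, label, eps, alpha, radius, eta=0.2):
    """One meta-train / meta-test / final-update iteration (illustrative only)."""
    # Clean features from both frequency bands, mixed in with equal weight.
    low_c, high_c = split_frequencies(x_clean, radius)
    clean_feat = model.features(low_c) + model.features(high_c)

    # Meta-train (LF-AFM): the loss sees only the low-frequency part of x_adv.
    low_adv, _ = split_frequencies(x_adv, radius)
    _, g_low = model.loss_and_grad(low_adv, clean_feat, label, eta)
    # The low-pass filter is linear and self-adjoint, so back-propagating
    # through it amounts to low-pass filtering the gradient.
    g_train, _ = split_frequencies(g_low, radius)
    x_tmp = np.clip(x_adv + alpha * np.sign(g_train), x_clean - eps, x_clean + eps)

    # Meta-test (AFM): gradient at the temporary sample, full spectrum.
    _, g_test = model.loss_and_grad(x_tmp, clean_feat, label, eta)

    # Final update: combine both gradients, then project back into the
    # eps-ball around the clean image and the valid pixel range.
    x_new = x_adv + alpha * np.sign(g_train + g_test)
    x_new = np.clip(x_new, x_clean - eps, x_clean + eps)
    return np.clip(x_new, 0.0, 1.0)
```

In use, one would call `cross_frequency_step` repeatedly, starting from `x_adv = x_clean.copy()`, with a grayscale image scaled to `[0, 1]`. A real attack would swap `ToyModel` for a pretrained network, mix features at an intermediate layer, and obtain gradients by automatic differentiation.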