Versatile Defense Against Adversarial Attacks on Image Recognition

Haibo Zhang, Zhihua Yao, Kouichi Sakurai
2024-03-13
Abstract: Adversarial attacks present a significant security risk to image recognition tasks. Defending against these attacks in a real-life setting can be compared to the way antivirus software works, with a key consideration being how well the defense can adapt to new and evolving attacks. Another important factor is the resources involved, in terms of time and cost, for training defense models and updating the model database. Training many models that are each specific to one type of attack can be time-consuming and expensive. Ideally, we should be able to train one single model that can handle a wide range of attacks. It appears that a defense method based on image-to-image translation may be capable of this. The versatile defense approach proposed in this paper requires training only one model to effectively resist various unknown adversarial attacks. The trained model improved the classification accuracy from nearly zero to an average of 86%, performing better than other defense methods proposed in prior studies. When facing the PGD attack and the MI-FGSM attack, the versatile defense model even outperforms the attack-specific models trained on these two attacks. The robustness check also shows that the versatile defense model performs stably regardless of the attack strength.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: **How can a general-purpose defense mechanism be built that resists multiple adversarial attacks on image recognition tasks while reducing training and maintenance costs?**

Specifically, convolutional neural networks (CNNs) perform very well on image recognition tasks but are highly sensitive to adversarial attacks, which cause misclassification through tiny, almost imperceptible perturbations. This vulnerability is especially dangerous in critical applications such as face recognition and autonomous driving, so robust defense mechanisms are needed.

Traditional defense methods are usually trained for specific attack types, which consumes considerable time and resources and makes them difficult to adapt to new, constantly evolving attacks. Ideally, a single general-purpose model could be trained to resist multiple unknown adversarial attacks. This paper proposes a method based on image-to-image translation to achieve this goal.

### Main problem summary

1. **Improve the generalization ability of the model**: Ensure that the defense model not only resists known attack types but also remains effective against unknown attacks.
2. **Reduce resource consumption**: Train a single general-purpose model instead of multiple attack-specific defense models, reducing the time and cost required.
3. **Improve classification accuracy**: Restore the accuracy of image classification when inputs have been adversarially perturbed.

Through these efforts, the researchers hope to develop a more efficient and robust defense mechanism suited to a variety of real-world application scenarios.
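
To make the idea of a tiny perturbation causing misclassification concrete, here is a minimal sketch of a one-step, FGSM-style attack on a toy linear classifier. It is an illustration only and is not the paper's attack setup or defense: the 16-value "image", the weights, the step size `eps`, and the helper names `score`, `predict`, and `fgsm_linear` are all hypothetical.

```python
# Toy illustration (assumed setup, not from the paper): a linear classifier
# over 16 "pixels" in [0, 1], attacked with a single FGSM-style step.

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def score(w, b, x):
    """Linear score w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Class +1 if the score is positive, otherwise class -1."""
    return 1 if score(w, b, x) > 0 else -1

def fgsm_linear(w, x, y, eps):
    """One FGSM-style step for a linear model with label y in {+1, -1}.

    For a loss that decreases as y * score grows (e.g. logistic loss), the
    sign of the input gradient is -y * sign(w), so each pixel moves by
    exactly eps in that direction and is then clipped to [0, 1].
    """
    return [min(1.0, max(0.0, xi - eps * y * sign(wi)))
            for xi, wi in zip(x, w)]

# Hypothetical weights and input: the clean score is 8 * (0.3 - 0.2) = 0.8.
w = [0.5, -0.5] * 8
b = 0.0
x = [0.6, 0.4] * 8
y = 1

x_adv = fgsm_linear(w, x, y, eps=0.15)

print(predict(w, b, x))      # clean prediction
print(predict(w, b, x_adv))  # prediction after the perturbation
print(max(abs(a - c) for a, c in zip(x, x_adv)))  # largest per-pixel change
```

In this toy case no pixel changes by more than `eps`, yet the predicted class flips. A defense of the kind the paper describes would sit between `x_adv` and the classifier, translating the perturbed image back toward a clean one before classification. That translation model is not sketched here.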