Abstract: Deep neural network-based image classifications are vulnerable to adversarial perturbations. The image classifications can be easily fooled by adding artificial small and imperceptible perturbations to input images. As one of the most effective defense strategies, adversarial training was proposed to address the vulnerability of classification models, where the adversarial examples are created and injected into training data during training. The attack and defense of classification models have been intensively studied in past years. Semantic segmentation, as an extension of classifications, has also received great attention recently. Recent work shows a large number of attack iterations are required to create effective adversarial examples to fool segmentation models. The observation makes both robustness evaluation and adversarial training on segmentation models challenging. In this work, we propose an effective and efficient segmentation attack method, dubbed SegPGD. Besides, we provide a convergence analysis to show the proposed SegPGD can create more effective adversarial examples than PGD under the same number of attack iterations. Furthermore, we propose to apply our SegPGD as the underlying attack method for segmentation adversarial training. Since SegPGD can create more effective adversarial examples, the adversarial training with our SegPGD can boost the robustness of segmentation models. Our proposals are also verified with experiments on popular segmentation model architectures and standard segmentation datasets.

What problem does this paper attempt to address?

The problem this paper addresses is how to generate adversarial examples effectively and efficiently for semantic segmentation, so that the robustness of segmentation models can be evaluated and improved. Specifically:

1. **Effectiveness and Efficiency of Adversarial Attacks**: Adversarial attack methods have been widely studied for classification tasks. In semantic segmentation, however, an attack must mislead the classification of all pixels simultaneously, so generating effective adversarial examples usually requires many more attack iterations. This makes both robustness evaluation and adversarial training difficult for segmentation models.
2. **Challenges of Adversarial Training**: Adversarial training is an effective way to improve model robustness, but in semantic segmentation, generating effective adversarial examples is very time-consuming. A more efficient way to generate adversarial examples is therefore needed to save time and computational resources during adversarial training.

To solve these problems, the authors propose **SegPGD** (Segmentation Projected Gradient Descent), an adversarial attack method designed specifically for semantic segmentation. By dynamically adjusting the weights of correctly classified and misclassified pixels in the loss function, SegPGD generates more effective adversarial examples than PGD with the same number of attack iterations. The authors also provide a convergence analysis supporting this claim and show how to apply SegPGD in adversarial training to improve the robustness of semantic segmentation models.

### Main Contributions

1. **Proposing SegPGD**: Based on the differences between classification and segmentation tasks, an effective and efficient segmentation adversarial attack method, SegPGD, is proposed, and its extension to single-step attacks, SegFGSM, is shown.
2. **Convergence Analysis**: A convergence analysis is provided, showing that SegPGD can generate more effective adversarial examples than PGD with the same number of attack iterations.
3. **Application to Adversarial Training**: Using SegPGD as the underlying attack method for adversarial training improves the robustness of segmentation models.
4. **Experimental Verification**: Experiments are carried out on popular segmentation model architectures (such as PSPNet and DeepLabV3) and standard segmentation datasets (such as PASCAL VOC and Cityscapes) to verify the effectiveness of the proposed method.

### Experimental Results

- **Quantitative Evaluation**: Across different numbers of attack iterations, the adversarial examples generated by SegPGD reduce the model's mIoU more quickly than those generated by PGD, showing higher effectiveness and efficiency.
- **Qualitative Evaluation**: Visualizations of the generated adversarial examples and the corresponding model predictions further indicate that the adversarial examples generated by SegPGD are more effective than those generated by PGD.
- **Comparison with Existing Methods**: SegPGD outperforms other segmentation adversarial attack methods, such as DAG and MLAttack, in both efficiency and effectiveness.

In conclusion, this paper addresses the challenge of generating adversarial examples for semantic segmentation by proposing SegPGD, which provides a new way to evaluate and improve the robustness of segmentation models.
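The re-weighting idea described in the summary above can be illustrated with a small sketch. It covers only the loss weighting, not the full attack loop (gradient step, projection onto the perturbation budget, clipping). The function names `segpgd_lambda` and `segpgd_loss` are hypothetical, and the linear schedule λ(t) = (t − 1) / (2T) is an assumption made for illustration: the summary states only that the weights are adjusted dynamically, not how.

```python
from typing import Sequence


def segpgd_lambda(t: int, total_iters: int) -> float:
    """Weight given to misclassified pixels at attack iteration t (1-indexed).

    ASSUMPTION: a linear schedule that starts at 0 and approaches 0.5,
    so early iterations focus on pixels that are still classified correctly.
    """
    if not 1 <= t <= total_iters:
        raise ValueError("t must satisfy 1 <= t <= total_iters")
    return (t - 1) / (2 * total_iters)


def segpgd_loss(
    pixel_losses: Sequence[float],
    predictions: Sequence[int],
    labels: Sequence[int],
    t: int,
    total_iters: int,
) -> float:
    """Weighted mean of per-pixel losses (e.g. per-pixel cross-entropy).

    Correctly classified pixels get weight (1 - lambda) and misclassified
    pixels get weight lambda. The sum is normalized by the pixel count.
    """
    if not (len(pixel_losses) == len(predictions) == len(labels)):
        raise ValueError("all inputs must have one entry per pixel")
    if not pixel_losses:
        raise ValueError("at least one pixel is required")

    lam = segpgd_lambda(t, total_iters)
    total = 0.0
    for loss, pred, label in zip(pixel_losses, predictions, labels):
        weight = (1.0 - lam) if pred == label else lam
        total += weight * loss
    return total / len(pixel_losses)


if __name__ == "__main__":
    # Four flattened pixels: the first two are correct, the last two are wrong.
    losses = [1.0, 2.0, 3.0, 4.0]
    preds = [0, 1, 1, 0]
    labels = [0, 1, 0, 1]
    for step in (1, 3):
        print(step, segpgd_loss(losses, preds, labels, step, total_iters=4))
```

Under this assumed schedule, the first iteration ignores already-misclassified pixels entirely, and later iterations gradually restore their weight. In a real attack, the gradient of this weighted loss with respect to the input image would replace the gradient of the plain mean loss used by PGD.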

SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness

TranSegPGD: Improving Transferability of Adversarial Examples on Semantic Segmentation

Multiclass ASMA vs Targeted PGD Attack in Image Segmentation

GenSeg: on Generating Unified Adversary for Segmentation

CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks

Adversarial Examples for Semantic Segmentation and Object Detection

Low-Rank Adversarial PGD Attack

Efficient Two-Step Adversarial Defense for Deep Neural Networks

Robust Image Classification: Defensive Strategies against FGSM and PGD Adversarial Attacks

Perturbation-Seeking Generative Adversarial Networks: A Defense Framework for Remote Sensing Image Scene Classification

FCGSM: Fast Conjugate Gradient Sign Method for Adversarial Attack on Image Classification

Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off

Enhancing Adversarial Text Attacks on BERT Models with Projected Gradient Descent

Dynamic Divide-and-Conquer Adversarial Training for Robust Semantic Segmentation

A Certified Radius-Guided Attack Framework to Image Segmentation Models

Guidance Through Surrogate: Towards a Generic Diagnostic Attack

Towards sustainable adversarial training with successive perturbation generation

DPG: a model to build feature subspace against adversarial patch attack

Universal Adversarial Perturbations Against Semantic Image Segmentation

Stealthy Adversarial Examples for Semantic Segmentation in Remote Sensing

Adversarial Attacks on Video Object Segmentation with Hard Region Discovery