DarkSAM: Fooling Segment Anything Model to Segment Nothing

Ziqi Zhou, Yufei Song, Minghui Li, Shengshan Hu, Xianlong Wang, Leo Yu Zhang, Dezhong Yao, Hai Jin
2024-09-26
Abstract: Segment Anything Model (SAM) has recently gained much attention for its outstanding generalization to unseen data and tasks. Despite its promising prospect, the vulnerabilities of SAM, especially to universal adversarial perturbation (UAP), have not been thoroughly investigated yet. In this paper, we propose DarkSAM, the first prompt-free universal attack framework against SAM, including a semantic decoupling-based spatial attack and a texture distortion-based frequency attack. We first divide the output of SAM into foreground and background. Then, we design a shadow target strategy to obtain the semantic blueprint of the image as the attack target. DarkSAM is dedicated to fooling SAM by extracting and destroying crucial object features from images in both spatial and frequency domains. In the spatial domain, we disrupt the semantics of both the foreground and background in the image to confuse SAM. In the frequency domain, we further enhance the attack effectiveness by distorting the high-frequency components (i.e., texture information) of the image. Consequently, with a single UAP, DarkSAM renders SAM incapable of segmenting objects across diverse images with varying prompts. Experimental results on four datasets for SAM and its two variant models demonstrate the powerful attack capability and transferability of DarkSAM.
Artificial Intelligence
### What problem does this paper attempt to solve?

This paper addresses the vulnerability of the Segment Anything Model (SAM) to universal adversarial perturbations (UAPs). Specifically, the authors propose DarkSAM, the first prompt-free universal adversarial attack framework for SAM and its variant models. Its goal is to use a single UAP to make SAM unable to segment objects across diverse images, regardless of the input prompt.

#### Main problem description

1. **Universal adversarial attacks on SAM have not been fully studied**:
   - Although SAM performs well on unseen data and tasks, its vulnerability to UAPs has not been thoroughly studied.
   - Existing adversarial attacks mainly target classification tasks. SAM, as a prompt-guided segmentation model, relies on both input images and prompts to generate label-free masks, so traditional adversarial attack methods do not apply directly.
2. **How to deceive SAM with a single UAP so that it cannot segment any object**:
   - SAM's input includes not only images but also prompts (such as points and boxes), and its output does not depend on pixel-level labels, which makes it harder to attack.
   - The question raised in the paper is: can SAM be deceived with a single UAP so that it cannot segment any object?

#### Solutions of DarkSAM

- **Comprehensive attack in the spatial and frequency domains**:
  - In the spatial domain, DarkSAM confuses SAM's decision-making by destroying the foreground and background features of the image.
  - In the frequency domain, DarkSAM strengthens the attack by distorting high-frequency components (i.e., texture information) while keeping low-frequency components (i.e., shape information) consistent.
- **Shadow Target Strategy**:
  - To resolve the double ambiguity introduced by varying images and varying prompts, DarkSAM introduces the Shadow Target Strategy, which improves the cross-prompt transferability of the UAP by increasing the number of prompts.
  - Specifically, for a given input image, multiple prompts (such as points or boxes) are randomly selected, and their mask outputs are combined to form a semantic blueprint that serves as the attack target.

#### Experimental results

- **High attack success rate and transferability**:
  - DarkSAM achieves a high attack success rate and transferability against SAM and its two variant models (HQ-SAM and PerSAM) on four benchmark datasets.
  - Qualitative and quantitative results indicate that DarkSAM can effectively deceive SAM and prevent it from segmenting images correctly.

In conclusion, this paper reveals the vulnerability of SAM to universal adversarial attacks by proposing the DarkSAM framework, and demonstrates that a single UAP can render SAM ineffective.
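
The two building blocks summarized above, the shadow target and the high/low-frequency split, can be illustrated with a small NumPy sketch. This is only an illustration under stated assumptions, not the authors' implementation. The function names (`shadow_target`, `split_frequencies`, `frequency_objective`) are hypothetical, the masks are assumed to have already been produced by SAM for several random prompts, and the circular FFT low-pass filter is a stand-in chosen for simplicity; the summary does not say which transform the paper uses.

```python
import numpy as np


def shadow_target(masks):
    """Merge binary masks (one per sampled prompt) into one blueprint.

    Assumption: 'combined' is read here as a pixel-wise union.
    """
    stacked = np.stack([np.asarray(m, dtype=bool) for m in masks])
    return stacked.any(axis=0)


def split_frequencies(image, radius):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    Uses an ideal circular low-pass filter in the FFT domain
    (a choice made for this sketch, not taken from the paper).
    """
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    # After fftshift the zero-frequency term sits at (h // 2, w // 2).
    dist = np.hypot(yy - h // 2, xx - w // 2)
    low_mask = dist <= radius
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = image - low  # the residual carries texture-like detail
    return low, high


def frequency_objective(clean, adv, radius=4):
    """Hypothetical score: larger when the perturbation changes
    high frequencies (texture) and leaves low frequencies (shape) alone."""
    low_c, high_c = split_frequencies(clean, radius)
    low_a, high_a = split_frequencies(adv, radius)
    hf_shift = np.mean((high_a - high_c) ** 2)
    lf_shift = np.mean((low_a - low_c) ** 2)
    return hf_shift - lf_shift
```

In an actual attack, a score of this kind would be one term of a loss optimized over the perturbation, alongside a spatial term that pushes SAM's output away from the shadow target. Here it only shows that a checkerboard-like perturbation scores high while a uniform brightness shift scores low.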