Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection

Zhenni Yu, Xiaoqin Zhang, Li Zhao, Yi Bin, Guobao Xiao

2024-07-17

Abstract: This paper introduces a new Segment Anything Model with Depth Perception (DSAM) for Camouflaged Object Detection (COD). DSAM exploits the zero-shot capability of SAM to realize precise segmentation in the RGB-D domain. It consists of the Prompt-Deeper Module and the Finer Module. The Prompt-Deeper Module utilizes knowledge distillation and the Bias Correction Module to achieve the interaction between RGB features and depth features, especially using depth features to correct erroneous parts in RGB features. Then, the interacted features are combined with the box prompt in SAM to create a prompt with depth perception. The Finer Module explores the possibility of accurately segmenting highly camouflaged targets from a depth perspective. It uncovers depth cues in areas missed by SAM through mask reversion, self-filtering, and self-attention operations, compensating for its defects in the COD domain. DSAM represents the first step towards the SAM-based RGB-D COD model. It maximizes the utilization of depth features while synergizing with RGB features to achieve multimodal complementarity, thereby overcoming the segmentation limitations of SAM and improving its accuracy in COD. Experimental results on COD benchmarks demonstrate that DSAM achieves excellent segmentation performance and reaches the state-of-the-art (SOTA) on COD benchmarks with less consumption of training resources. The code will be available at https://github.com/guobaoxiao/DSAM.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the poor segmentation performance of the Segment Anything Model (SAM) in Camouflaged Object Detection (COD), where camouflaged objects are highly similar to their background. SAM segments mainly from RGB images, so when it deals with highly camouflaged objects it cannot effectively extract semantic and structural information, which reduces segmentation accuracy.

To solve this problem, the authors propose a new SAM-based model, the Segment Anything Model with Depth Perception (DSAM), which aims to improve the segmentation of highly camouflaged objects by introducing depth information. DSAM achieves this goal through two modules:

1. **Prompt-Deeper Module (PDM)**: This module uses knowledge distillation and the Bias Correction Module (BCM) to realize the interaction between RGB features and depth features, in particular using depth features to correct erroneous parts of the RGB features. The interacted features are then combined with the box prompt in SAM to generate a prompt with depth perception.
2. **Finer Module (FM)**: This module explores the possibility of accurately segmenting highly camouflaged objects from a depth perspective. Through mask reversion, self-filtering, and self-attention operations, FM uncovers depth cues in areas missed by SAM and compensates for its defects in the COD domain.

Through the synergy of these two modules, DSAM maximizes the use of depth features and complements them with RGB features, thereby overcoming the segmentation limitations of SAM in the COD domain and improving its accuracy on COD tasks. Experimental results show that DSAM achieves excellent segmentation performance on COD benchmark datasets and reaches the state-of-the-art (SOTA) level while consuming fewer training resources.
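The mask reversion idea mentioned for the Finer Module can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "mask reversion" means inverting a coarse mask probability (a common reverse-attention pattern) so that depth values are emphasized where the coarse mask is not confident foreground. The function names `reverse_mask` and `reweight_depth` are hypothetical, and the self-filtering and self-attention steps are omitted.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def reverse_mask(mask_logits):
    """Invert a coarse mask (assumed to be given as logits).

    The result is close to 1 where the coarse mask predicts background
    and close to 0 where it is confident about foreground.
    """
    return [[1.0 - sigmoid(v) for v in row] for row in mask_logits]


def reweight_depth(depth, mask_logits):
    """Weight a depth map element-wise by the reversed mask.

    Depth values survive mainly in regions the coarse mask missed,
    which is where additional depth cues would be searched for.
    """
    reversed_mask = reverse_mask(mask_logits)
    return [
        [d * r for d, r in zip(depth_row, rev_row)]
        for depth_row, rev_row in zip(depth, reversed_mask)
    ]


# Toy example: the first pixel is confidently segmented, the second is missed.
coarse_logits = [[10.0, -10.0]]
depth_map = [[0.8, 0.8]]
focused = reweight_depth(depth_map, coarse_logits)
```

In this toy case the depth value at the confidently segmented pixel is suppressed almost to zero, while the value at the missed pixel is kept nearly unchanged. A real model would apply the same idea to multi-channel feature tensors rather than to a single-channel depth map.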
