TS-SAM: Fine-Tuning Segment-Anything Model for Downstream Tasks

Yang Yu, Chen Xu, Kai Wang

2024-08-04

Abstract: Adapter-based fine-tuning has been studied for improving the performance of SAM on downstream tasks. However, there is still a significant performance gap between fine-tuned SAMs and domain-specific models. To reduce the gap, we propose Two-Stream SAM (TS-SAM). On the one hand, inspired by the side network in Parameter-Efficient Fine-Tuning (PEFT), we designed a lightweight Convolutional Side Adapter (CSA), which integrates the powerful features from SAM into side network training for comprehensive feature fusion. On the other hand, in line with the characteristics of segmentation tasks, we designed a Multi-scale Refinement Module (MRM) and a Feature Fusion Decoder (FFD) to keep both the detailed and semantic features. Extensive experiments on ten public datasets from three tasks demonstrate that TS-SAM not only significantly outperforms the recently proposed SAM-Adapter and SSOM, but also achieves competitive performance with the SOTA domain-specific models. Our code is available at: https://github.com/maoyangou147/TS-SAM.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The main goal of this paper is to improve the performance of the Segment-Anything Model (SAM) on downstream tasks, in particular three challenging ones: Camouflaged Object Detection (COD), Shadow Detection, and Salient Object Detection (SOD). To address SAM's poor performance on these tasks, the authors propose the Two-Stream SAM (TS-SAM) method. The key contributions of TS-SAM are:

1. **Introducing side networks into SAM fine-tuning for the first time**: A lightweight Convolutional Side Adapter (CSA) extracts features from the SAM encoder and adapts them to different downstream tasks.
2. **Multi-Scale Refinement Module (MRM) and Feature Fusion Decoder (FFD) tailored for segmentation tasks**: These modules capture detailed features in high-resolution images and fully integrate them during decoding, resulting in more precise segmentation.
3. **Extensive experimental validation**: Experiments on 10 public datasets show that TS-SAM not only significantly outperforms recently proposed methods such as SAM-Adapter and SSOM but is also competitive with state-of-the-art domain-specific models designed for each task.

Together, these components improve SAM's performance on a range of downstream tasks while keeping the fine-tuned part of the model lightweight.
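The two-stream idea in the first contribution can be illustrated with a small PyTorch sketch. This is not the authors' implementation: the class names (`FrozenBackbone`, `ConvSideAdapter`, `TwoStreamSegmenter`), the channel sizes, and the fusion rule are all hypothetical, and a tiny convolutional stack stands in for the SAM image encoder. The MRM and FFD are not modeled. The sketch only shows the general side-network pattern: the pretrained stream is frozen, and a much smaller trainable stream absorbs its features stage by stage.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FrozenBackbone(nn.Module):
    """Hypothetical stand-in for a pretrained encoder (not SAM's actual ViT)."""

    def __init__(self, channels: int = 32, num_stages: int = 4):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, kernel_size=4, stride=4)
        self.stages = nn.ModuleList(
            [
                nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.GELU())
                for _ in range(num_stages)
            ]
        )
        # Freeze every backbone weight; only the side stream is trained.
        for p in self.parameters():
            p.requires_grad = False

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> list:
        x = self.stem(x)
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)  # keep one feature map per stage for the side stream
        return feats


class ConvSideAdapter(nn.Module):
    """Assumed adapter design: project a backbone feature and mix it into the side stream."""

    def __init__(self, backbone_ch: int, side_ch: int):
        super().__init__()
        self.reduce = nn.Conv2d(backbone_ch, side_ch, kernel_size=1)
        self.mix = nn.Sequential(
            nn.Conv2d(side_ch, side_ch, kernel_size=3, padding=1, groups=side_ch),
            nn.GELU(),
            nn.Conv2d(side_ch, side_ch, kernel_size=1),
        )

    def forward(self, side: torch.Tensor, backbone_feat: torch.Tensor) -> torch.Tensor:
        fused = side + self.reduce(backbone_feat)  # inject frozen features
        return fused + self.mix(fused)             # residual conv refinement


class TwoStreamSegmenter(nn.Module):
    """Frozen stream plus a lightweight trainable side stream and a 1x1 mask head."""

    def __init__(self, channels: int = 32, side_ch: int = 8, num_stages: int = 4):
        super().__init__()
        self.backbone = FrozenBackbone(channels, num_stages)
        self.side_stem = nn.Conv2d(3, side_ch, kernel_size=4, stride=4)
        self.adapters = nn.ModuleList(
            [ConvSideAdapter(channels, side_ch) for _ in range(num_stages)]
        )
        self.head = nn.Conv2d(side_ch, 1, kernel_size=1)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(image)
        side = self.side_stem(image)
        for adapter, feat in zip(self.adapters, feats):
            side = adapter(side, feat)
        logits = self.head(side)
        # Upsample mask logits back to the input resolution.
        return F.interpolate(
            logits, size=image.shape[-2:], mode="bilinear", align_corners=False
        )


model = TwoStreamSegmenter()
mask_logits = model(torch.randn(2, 3, 64, 64))
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
print(mask_logits.shape, trainable, frozen)
```

Because the backbone runs under `torch.no_grad()`, no activations are stored for it during training, which is the usual memory argument for side networks over adapters inserted inside the backbone.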

SAM-Adapter: Adapting Segment Anything in Underperformed Scenes

Integrating Spatial Prior Adapter for Enhancing SAM Performance in Medical Image Segmentation

SAM Fails to Segment Anything? – SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More

SAMP: Adapting Segment Anything Model for Pose Estimation

SU-SAM: A Simple Unified Framework for Adapting Segment Anything Model in Underperformed Scenes

SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More

Continual Learning for Segment Anything Model Adaptation

TinySAM: Pushing the Envelope for Efficient Segment Anything Model

ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation

ASAM: Boosting Segment Anything Model with Adversarial Tuning

Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation

Deep Instruction Tuning for Segment Anything Model

Task-Aware Low-Rank Adaptation of Segment Anything Model

RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation

SAM-PARSER: Fine-tuning SAM Efficiently by Parameter Space Reconstruction

MeSAM: Multiscale Enhanced Segment Anything Model for Optical Remote Sensing Images

WebSAM-Adapter: Adapting Segment Anything Model for Web Page Segmentation

Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation

Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection

Lite-SAM Is Actually What You Need for Segment Everything