Abstract: The Segment-Anything Model (SAM) is a vision foundation model for segmentation with a prompt-driven framework. SAM generates class-agnostic masks based on user-specified instance-referring prompts. However, adapting SAM for automated segmentation of specific object classes, where manual input is absent, often requires additional model training. We present Segment Any Class (SAC), a novel, training-free approach that task-adapts SAM for multi-class segmentation. SAC generates Class-Region Proposals (CRP) on query images, which allows us to automatically generate class-aware prompts on probable locations of class instances. CRPs are derived from elementary intra-class and inter-class feature distinctions without any additional training. Our method is versatile, accommodating any N-way K-shot configuration for the multi-class few-shot semantic segmentation (FSS) task. Unlike gradient-learning adaptation of generalist models, which risks the loss of generalization and potentially suffers from catastrophic forgetting, SAC solely utilizes automated prompting and achieves superior results over state-of-the-art methods on the COCO-20i benchmark, particularly excelling in high N-way class scenarios. SAC is an interesting demonstration of a prompt-only approach to adapting foundation models for novel tasks with small, limited datasets without any modifications to the foundation model itself. This method offers interesting benefits such as intrinsic immunity to concept or feature loss and rapid, online task adaptation of foundation models.

What problem does this paper attempt to address?

This paper addresses how to adapt a foundation model such as the Segment-Anything Model (SAM) to multi-class few-shot semantic segmentation without additional training. Specifically:

1. **Problem Background**:
   - Foundation models such as SAM can generate class-agnostic masks from user-specified instance-referring prompts, but automatically segmenting specific object classes usually requires additional model training.
   - For multi-class few-shot semantic segmentation, directly updating model weights may lead to catastrophic forgetting, in which learning a new task overwrites previously learned knowledge.

2. **Research Objectives**:
   - Propose a training-free method that lets the foundation model adapt automatically to multi-class segmentation tasks.
   - Generate Class-Region Proposals (CRP) so that prompts can be placed automatically on probable locations of class instances.
   - Verify the method on the COCO-20i benchmark, especially in high N-way (many-class) scenarios.

3. **Solutions**:
   - Propose the Segment Any Class (SAC) method, which performs multi-class few-shot semantic segmentation without gradient learning by leveraging two foundation models, DINOv2 and SAM.
   - SAC works in two steps:
     1. **Support Feature Extraction**: Use DINOv2 to extract features from the support images and construct a Category-Representative Feature Array (CRFA).
     2. **Mask Prediction and Class Region Proposals**: At inference, compute the similarity map between the query image and the CRFA to generate Class-Region Proposals (CRP), then derive from them a prompt set that SAM uses for segmentation.

4. **Innovations**:
   - SAC avoids gradient learning entirely, so the foundation model's weights are never changed and catastrophic forgetting cannot occur.
   - By generating prompt sets automatically, SAC can adapt quickly to new tasks on small datasets.
   - It performs well in multi-class few-shot segmentation tasks, especially in high N-way scenarios.

In conclusion, this paper aims to adapt a foundation model to multi-class few-shot semantic segmentation without modifying the model's weights, and proposes a new method based on automatic prompt generation.
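The two steps summarized above (class-representative features from the support set, then similarity maps and prompts on the query) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: it assumes patch features have already been extracted (for example by DINOv2), uses a simple per-class mean feature as a stand-in for the CRFA, and takes cosine similarity with a fixed threshold as the proposal rule. The function names, the `threshold` parameter, and the one-point-per-class prompt choice are all hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def build_class_prototypes(support_feats, support_masks, n_classes):
    """Average support features per class (a simplified stand-in for the CRFA).

    support_feats: (S, H, W, D) patch features of S support images.
    support_masks: (S, H, W) integer labels, 0 = background, 1..n_classes.
    Returns an (n_classes, D) array of unit-normalized class vectors.
    """
    dim = support_feats.shape[-1]
    protos = np.zeros((n_classes, dim))
    for c in range(1, n_classes + 1):
        selected = support_masks == c
        if selected.any():
            # Boolean indexing yields a (num_patches, D) array for class c.
            protos[c - 1] = support_feats[selected].mean(axis=0)
    return l2_normalize(protos)


def class_region_proposals(query_feats, protos, threshold=0.5):
    """Cosine-similarity maps and a per-patch class proposal map.

    query_feats: (H, W, D) patch features of the query image.
    Returns (sim, labels): sim is (n_classes, H, W); labels is (H, W)
    with 0 where no class reaches `threshold` (an assumed cutoff).
    """
    q = l2_normalize(query_feats)
    sim = np.einsum("hwd,nd->nhw", q, protos)
    best_class = sim.argmax(axis=0)
    best_score = sim.max(axis=0)
    labels = np.where(best_score >= threshold, best_class + 1, 0)
    return sim, labels


def point_prompts(sim, labels):
    """Pick one point per proposed class: the most similar patch in its region.

    Coordinates are (row, col) on the feature grid; they would still need
    rescaling to pixel coordinates before being handed to a promptable
    segmenter such as SAM.
    """
    prompts = {}
    for c in np.unique(labels):
        if c == 0:
            continue
        masked = np.where(labels == c, sim[c - 1], -np.inf)
        row, col = np.unravel_index(masked.argmax(), masked.shape)
        prompts[int(c)] = (int(row), int(col))
    return prompts
```

Because nothing here is trained, switching to a different N-way K-shot configuration only changes the shapes of the support arrays, which is the property the summary attributes to SAC. A real pipeline would likely use more than one prototype per class and richer prompt sets than a single point.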

Segment Any Class (SAC): Multi-Class Few-Shot Semantic Segmentation via Class Region Proposals

Sam-Rsp: A New Few-Shot Segmentation Method Based on Segment Anything Model and Rough Segmentation Prompts

Segment Any Cell: A SAM-based Auto-prompting Fine-tuning Framework for Nuclei Segmentation

APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation

SAM-Adapter: Adapting Segment Anything in Underperformed Scenes

SAM Fails to Segment Anything? – SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More

Boosting Few-Shot Semantic Segmentation Via Segment Anything Model

Semantic-SAM: Segment and Recognize Anything at Any Granularity

AM-SAM: Automated Prompting and Mask Calibration for Segment Anything Model

SSA-Seg: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation

Self-guided Few-shot Semantic Segmentation for Remote Sensing Imagery Based on Large Vision Models

Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance

Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation

Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts

RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation

Prototypical Metric Segment Anything Model for Data-Free Few-Shot Semantic Segmentation

Adaptive Prompt Learning with SAM for Few-shot Scanning Probe Microscope Image Segmentation

FocSAM: Delving Deeply into Focused Objects in Segmenting Anything

Effective SAM Combination for Open-Vocabulary Semantic Segmentation

AGSAM: Agent-Guided Segment Anything Model for Automatic Segmentation in Few-Shot Scenarios

Task-Specific Adaptation of Segmentation Foundation Model via Prompt Learning