Abstract: Foundation models such as the Segment Anything Model (SAM) require high-quality manual prompts for medical image segmentation, and producing those prompts is time-consuming and requires expertise. SAM and its variants often fail to segment structures in ultrasound (US) images due to domain shift. We propose Sam2Rad, a prompt learning approach that adapts SAM and its variants for US bone segmentation without human prompts. It introduces a prompt predictor network (PPN) with a cross-attention module to predict prompt embeddings from image encoder features. The PPN outputs bounding box and mask prompts, and 256-dimensional embeddings for regions of interest. The framework allows optional manual prompting and can be trained end-to-end using parameter-efficient fine-tuning (PEFT). Sam2Rad was tested on 3 musculoskeletal US datasets: wrist (3822 images), rotator cuff (1605 images), and hip (4849 images). It improved performance across all datasets without manual prompts, increasing Dice scores by 2-7% for hip/wrist and up to 33% for shoulder data. Sam2Rad can be trained with as few as 10 labeled images and is compatible with any SAM architecture for automatic segmentation.
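
The results above are reported as Dice scores. As a reminder of what that metric measures, here is a minimal sketch of the standard Dice coefficient, 2|A∩B| / (|A| + |B|), for two binary masks. The function name `dice_score` and the toy masks are illustrative assumptions; the abstract does not say how the authors threshold predictions or average scores across images.

```python
def dice_score(pred, target):
    """Dice coefficient for two binary masks given as nested lists of 0/1.

    Standard definition: 2 * |pred AND target| / (|pred| + |target|).
    This is a generic sketch, not the authors' evaluation code.
    """
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a common convention).
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks (made up for illustration): 2 overlapping pixels, 3 foreground pixels each.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
score = dice_score(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.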

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper aims to address the issues faced when using foundation models (such as the Segment Anything Model, SAM) in medical image segmentation. Specifically:

1. **The need for high-quality manual prompts**: Existing medical image segmentation methods require high-quality manual prompts, which are typically generated by individuals with medical expertise, making the process time-consuming and costly.
2. **Poor cross-domain adaptability**: Even with sparse prompts (such as boxes, points, or text) or dense prompts (such as masks), SAM and its variants (like MedSAM) perform poorly when segmenting bones in ultrasound (US) images, primarily because of significant domain shift.
3. **Dependence on manual prompts**: Current methods still rely on manual prompts for image segmentation, which is often impractical in real applications, especially when medical expertise is unavailable.

To address these issues, the authors propose a new prompt learning method, Sam2Rad, which generates prompts automatically and so completes medical image segmentation tasks without human intervention. Specifically, Sam2Rad introduces a Prompt Predictor Network (PPN) to predict prompt embeddings, generating bounding box and mask prompts directly from features extracted by the image encoder. The framework also supports optional manual prompts, which can be combined with the learned prompts and fed into the mask decoder. Through Parameter-Efficient Fine-Tuning (PEFT), the PPN and mask decoder can be trained end-to-end.

Experimental results show that Sam2Rad significantly improves segmentation performance on multiple ultrasound datasets, particularly on shoulder ultrasound images, where the Dice score increased from 49% to 82%.
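
To make the idea of predicting prompt embeddings from image encoder features more concrete, the following is a minimal NumPy sketch of single-head cross-attention, in which a small set of query vectors attends over flattened image features and returns one 256-dimensional embedding per query. It is not the authors' implementation: the number of queries, the 8x8 feature grid, the single attention head, and the random weights are all assumptions made for illustration, and the box and mask prompt heads are omitted.

```python
import numpy as np

EMBED_DIM = 256  # prompt embedding width mentioned in the abstract


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(queries, image_feats, w_q, w_k, w_v):
    """Single-head cross-attention: each query attends over all image tokens.

    queries:     (num_queries, d) query vectors (learned in a real model)
    image_feats: (num_tokens, d)  flattened image encoder features
    Returns (attended embeddings, attention weights).
    """
    q = queries @ w_q
    k = image_feats @ w_k
    v = image_feats @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (num_queries, num_tokens)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
num_queries = 4        # hypothetical number of prompt queries
feat_h, feat_w = 8, 8  # hypothetical feature-map size

# Random stand-ins for encoder features and trained parameters.
image_feats = rng.standard_normal((feat_h * feat_w, EMBED_DIM))
queries = rng.standard_normal((num_queries, EMBED_DIM))
w_q, w_k, w_v = (
    rng.standard_normal((EMBED_DIM, EMBED_DIM)) / np.sqrt(EMBED_DIM)
    for _ in range(3)
)

prompt_embeddings, attn = cross_attention(queries, image_feats, w_q, w_k, w_v)
```

In a trained model, the queries and projection matrices would be learned parameters, and the resulting embeddings would be passed to the mask decoder in place of, or alongside, embeddings derived from manual prompts.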

Sam2Rad: A Segmentation Model for Medical Images with Learnable Prompts

Segment Anything Model for Medical Image Analysis: an Experimental Study

US-SAM:An Automatic Prompt Sam for Ultrasound Image

Segmentation by registration-enabled SAM prompt engineering using five reference images

P2SAM: Probabilistically Prompted SAMs Are Efficient Segmentator for Ambiguous Medical Images

Med-PerSAM: One-Shot Visual Prompt Tuning for Personalized Segment Anything Model in Medical Domain

Beyond Adapting SAM: Towards End-to-End Ultrasound Image Segmentation via Auto Prompting

False Negative/Positive Control for SAM on Noisy Medical Images

K-SAM: A Prompting Method Using Pretrained U-Net to Improve Zero Shot Performance of SAM on Lung Segmentation in CXR Images

Medical SAM 2: Segment medical images as video via Segment Anything Model 2

AutoProSAM: Automated Prompting SAM for 3D Multi-Organ Segmentation

Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding

SAM.MD: Zero-shot medical image segmentation capabilities of the Segment Anything Model

How Segment Anything Model (SAM) Boost Medical Image Segmentation?

Segment anything model 2: an application to 2D and 3D medical images

ESP-MedSAM: Efficient Self-Prompting SAM for Universal Domain-Generalized Medical Image Segmentation

S-SAM: SVD-based Fine-Tuning of Segment Anything Model for Medical Image Segmentation

Is SAM 2 Better than SAM in Medical Image Segmentation?

RevSAM2: Prompt SAM2 for Medical Image Segmentation via Reverse-Propagation without Fine-tuning

Customized Segment Anything Model for Medical Image Segmentation

Automating MedSAM by Learning Prompts with Weak Few-Shot Supervision