Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models

Lin Zhao, Xiao Chen, Eric Z. Chen, Yikang Liu, Terrence Chen, Shanhui Sun
2024-08-16
Abstract: Medical image segmentation is crucial for clinical decision-making, but the scarcity of annotated data presents significant challenges. Few-shot segmentation (FSS) methods show promise but often require retraining on the target domain and struggle to generalize across different modalities. Similarly, adapting foundation models like the Segment Anything Model (SAM) for medical imaging has limitations, including the need for finetuning and domain-specific adaptation. To address these issues, we propose a novel method that adapts DINOv2 and Segment Anything Model 2 (SAM 2) for retrieval-augmented few-shot medical image segmentation. Our approach uses DINOv2 features as queries to retrieve similar samples from limited annotated data, which are then encoded as memories and stored in a memory bank. With the memory attention mechanism of SAM 2, the model leverages these memories as conditions to generate accurate segmentation of the target image. We evaluated our framework on three medical image segmentation tasks, demonstrating superior performance and generalizability across various modalities without the need for any retraining or finetuning. Overall, this method offers a practical and effective solution for few-shot medical image segmentation and holds significant potential as a valuable annotation tool in clinical applications.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the difficulty of medical image segmentation when labeled data is scarce. Specifically, it aims to:

1. **Reduce the dependence on large amounts of labeled data**: Medical image segmentation requires accurate annotations, but producing them is time-consuming and requires trained medical professionals, so large labeled datasets are hard to obtain.
2. **Improve generalization across modalities**: Existing few-shot segmentation (FSS) methods usually need to be retrained on the target domain and generalize poorly between modalities (such as CT and MRI).
3. **Avoid retraining or fine-tuning**: Many existing FSS methods and foundation models (such as the Segment Anything Model, SAM) need fine-tuning or domain-specific adaptation before they can be applied to medical images, which increases the complexity and cost of deployment.

To solve these problems, the authors propose a method that combines DINOv2 and SAM 2 for retrieval-augmented few-shot medical image segmentation. Its main features are:

- **DINOv2 features as queries**: Similar samples are retrieved from the limited labeled data, encoded as memories, and stored in a memory bank.
- **Memory attention mechanism of SAM 2**: The stored memories serve as conditions for generating an accurate segmentation of the target image.
- **No retraining or fine-tuning**: The framework uses the pre-trained foundation models' architecture and weights as they are; applying it to a new domain only requires updating the samples in the retrieval database.

With this method, the authors aim for better segmentation performance and generalization across medical imaging modalities while reducing the dependence on large amounts of labeled data.

The method was evaluated on three medical image segmentation tasks, where it showed superior performance and generalized across modalities without any retraining or fine-tuning, which suggests practical value as an annotation tool in clinical applications.
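The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that image features have already been extracted (for example by DINOv2) into plain, non-zero vectors, and that similarity is measured with cosine similarity, which the summary does not specify. The names `cosine_similarity`, `retrieve_top_k`, and the toy feature values are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_feat, bank_feats, k=2):
    """Return (index, similarity) pairs for the k annotated samples
    whose features are most similar to the query, best match first."""
    scored = [
        (cosine_similarity(query_feat, feat), idx)
        for idx, feat in enumerate(bank_feats)
    ]
    # Sort by similarity only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, sim) for sim, idx in scored[:k]]


# Toy stand-ins for precomputed features of four annotated images.
bank_feats = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
# Toy stand-in for the feature of the target (query) image.
query_feat = [0.9, 0.1, 0.0, 0.0]

matches = retrieve_top_k(query_feat, bank_feats, k=2)
retrieved_indices = [idx for idx, _ in matches]
```

In the framework described above, the retrieved samples and their masks would then be encoded as memories for SAM 2; that stage depends on the pre-trained model and is not reproduced here. Adapting such a pipeline to a new domain would amount to replacing the contents of `bank_feats` and the associated annotations, consistent with the claim that only the retrieval database needs updating.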