Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey

Yichi Zhang, Rushi Jiao
2023-08-11
Abstract: Due to the flexibility of prompting, foundation models have become the dominant force in the domains of natural language processing and image generation. With the recent introduction of the Segment Anything Model (SAM), the prompt-driven paradigm has entered the realm of image segmentation, bringing with it a range of previously unexplored capabilities. However, it remains unclear whether SAM is applicable to medical image segmentation, given the significant differences between natural images and medical images. In this work, we summarize recent efforts to extend the success of SAM to medical image segmentation tasks, including both empirical benchmarking and methodological adaptations, and discuss potential future directions for SAM in medical image segmentation. Although directly applying SAM to medical image segmentation cannot obtain satisfying performance on multi-modal and multi-target medical datasets, many insights are drawn to guide future research to develop foundation models for medical image analysis. To facilitate future research, we maintain an active repository that contains an up-to-date paper list and open-source project summary at https://github.com/YichiZhang98/SAM4MIS.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper primarily explores how to apply the Segment Anything Model (SAM) to the field of medical image segmentation and summarizes recent research progress in this area. As a foundation model, SAM has demonstrated powerful performance in natural image processing, particularly in image segmentation tasks. However, there are significant differences between medical images and natural images, including structural complexity, lower contrast, and variability among observers. These differences pose challenges for directly applying SAM to medical image segmentation.

The paper first introduces the basics of SAM and its workflow, then summarizes two main research directions:

1. **Performance of SAM in Medical Image Segmentation**: Researchers have evaluated SAM's performance in various medical image segmentation tasks. SAM performs well on certain objects and modalities, reaching levels comparable to existing methods, but its performance is unsatisfactory on targets that have fuzzy boundaries, low contrast, irregular shapes, or small sizes. As a result, directly applying SAM to medical image segmentation often fails to achieve satisfactory performance.
2. **How to Better Adapt SAM for Medical Image Segmentation**: To improve SAM's performance in medical image segmentation tasks, researchers have explored various methods, including:
   - **Fine-tuning SAM**: Fine-tuning some of SAM's components on specific medical datasets can significantly improve its performance. Examples include fine-tuning only the mask decoder of SAM and adopting parameter-efficient fine-tuning strategies.
   - **Extending SAM's Application Scope**: Some studies aim to simplify the use of SAM in medical images, such as integrating it into commonly used medical image viewers or developing fully automated solutions that do not require manual prompts.
   - **Enhancing SAM's Robustness Against Different Prompts**: Methods have been proposed to reduce the impact of erroneous prompts on the final segmentation results, such as decoupling SAM's mask decoder to make it more robust.
   - **Using SAM for Input Enhancement**: Even if SAM's direct application to medical image segmentation is suboptimal, the segmentation masks and features it generates can still be used to enhance the original image input, thereby improving the performance of subsequent segmentation models.

In summary, although SAM's current performance in the field of medical image segmentation does not fully meet the requirements, it shows the potential for building a general segmentation foundation model. Future research directions may include establishing larger-scale medical image datasets, integrating multimodal information, and developing foundation models more suitable for the medical field.
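
The decoder-only fine-tuning strategy mentioned above amounts to freezing every parameter except those that belong to the mask decoder. The following is a minimal sketch of that selection step in plain Python, not the procedure of any specific paper covered by the survey. The helper `plan_trainable`, the `"mask_decoder."` prefix convention, and the example parameter names are all assumptions made for illustration; real checkpoints may name their parameters differently.

```python
from typing import Dict, Iterable, Tuple


def plan_trainable(
    param_names: Iterable[str],
    trainable_prefixes: Tuple[str, ...] = ("mask_decoder.",),
) -> Dict[str, bool]:
    """Map each parameter name to True (fine-tune) or False (freeze).

    A parameter is marked trainable only if its name starts with one of
    the given prefixes. The default prefix is an assumed naming scheme.
    """
    return {name: name.startswith(trainable_prefixes) for name in param_names}


# Hypothetical parameter names grouped under SAM's three components
# (image encoder, prompt encoder, mask decoder). Illustrative only.
example_names = [
    "image_encoder.blocks.0.attn.qkv.weight",
    "image_encoder.blocks.0.mlp.lin1.weight",
    "prompt_encoder.point_embeddings.0.weight",
    "mask_decoder.transformer.layers.0.self_attn.q_proj.weight",
    "mask_decoder.iou_prediction_head.layers.0.weight",
]

plan = plan_trainable(example_names)
trainable = [name for name, flag in plan.items() if flag]
frozen = [name for name, flag in plan.items() if not flag]

print(f"trainable: {len(trainable)}, frozen: {len(frozen)}")
for name in trainable:
    print("  fine-tune:", name)
```

In a PyTorch training script, such a plan could plausibly be applied by iterating over `model.named_parameters()` and setting each parameter's `requires_grad` flag to the planned value, so that the optimizer only updates the decoder. Parameter-efficient strategies follow the same pattern with a different set of trainable names, for example small adapter modules added to a frozen encoder.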