Segment Anything in Medical Images and Videos: Benchmark and Deployment

Jun Ma, Sumin Kim, Feifei Li, Mohammed Baharoon, Reza Asakereh, Hongwei Lyu, Bo Wang
2024-08-07
Abstract: Recent advances in segmentation foundation models have enabled accurate and efficient segmentation across a wide range of natural images and videos, but their utility for medical data remains unclear. In this work, we first present a comprehensive benchmark of the Segment Anything Model 2 (SAM2) across 11 medical image modalities and videos and point out its strengths and weaknesses by comparing it to SAM1 and MedSAM. Then, we develop a transfer learning pipeline and demonstrate that SAM2 can be quickly adapted to the medical domain by fine-tuning. Furthermore, we implement SAM2 as a 3D Slicer plugin and Gradio API for efficient 3D image and video segmentation. The code has been made publicly available at https://github.com/bowang-lab/MedSAM.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper sets out to evaluate and improve the performance of Segment Anything Model 2 (SAM2) on medical image and video segmentation. Specifically, the researchers focus on the following aspects:

1. **Comprehensive evaluation of SAM2's performance on multiple medical modalities**:
   - The researchers benchmarked SAM2 on 11 medical image and video modalities, including CT, MRI, PET, ultrasound (US), and endoscopy.
   - By comparing SAM2 with SAM1 (the first-generation Segment Anything Model) and MedSAM (a model built specifically for medical images), they identified SAM2's strengths and weaknesses.
2. **Developing a transfer learning pipeline to adapt to the medical field**:
   - To improve SAM2's performance on medical image segmentation, the researchers developed a transfer learning pipeline that quickly adapts SAM2 to the medical field by fine-tuning.
   - The experimental results show that the fine-tuned SAM2 significantly improves segmentation accuracy on the 3D abdominal organ segmentation task.
3. **Deploying SAM2 as a 3D Slicer plugin and Gradio API**:
   - The researchers integrated SAM2 into a 3D Slicer plugin and a Gradio API so that users can conveniently run efficient 3D medical image and video segmentation.
   - These interfaces let medical professionals access and use SAM2 without writing code, and provide feedback.
4. **Exploring the relationship between model size and performance**:
   - The study found that a larger model does not necessarily perform better on all medical image segmentation tasks. For example, in some modalities a smaller model (such as SAM2-Tiny) achieves the best performance.
   - This indicates that model size is not the only factor determining success in medical image segmentation; the specific characteristics of the training dataset and the training protocol are equally important.
5. **Analyzing the differences between the general-purpose model (SAM2) and the domain-specific model (MedSAM)**:
   - Although SAM2 performs well on 3D medical image segmentation, it is still inferior to MedSAM in most 2D medical image modalities.
   - The research shows that transfer learning can improve SAM2's medical image segmentation ability, but direct fine-tuning may weaken its original general-purpose segmentation ability.

Overall, this paper aims to promote the application and development of SAM2 in medical image and video segmentation through comprehensive evaluation, transfer learning, and the development of deployment tools.
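A per-modality model comparison of the kind summarized above can be sketched in a few lines. This is only an illustration under stated assumptions: the summary does not say which accuracy metric the paper uses, so the Dice overlap coefficient (a common choice for segmentation) is assumed here, and the function names `dice_score` and `best_model_per_modality`, along with the placeholder model and modality names, are hypothetical rather than taken from the paper's code.

```python
from typing import Dict, List, Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice overlap between two flattened binary masks (1 = foreground).

    Assumption: Dice is used as the accuracy metric; the summary above
    does not name the metric used in the paper.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention chosen here: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def best_model_per_modality(
    scores: Dict[str, Dict[str, List[float]]]
) -> Dict[str, str]:
    """Map each modality to the model with the highest mean score."""
    best: Dict[str, str] = {}
    for modality, per_model in scores.items():
        # Skip models that have no cases for this modality.
        means = {m: sum(v) / len(v) for m, v in per_model.items() if v}
        if means:
            best[modality] = max(means, key=means.get)
    return best


# Toy masks for illustration only; these are not results from the paper.
truth = [0, 1, 1, 1, 0, 0]
toy_scores = {
    "modality_x": {
        "model_a": [dice_score([0, 1, 1, 0, 0, 0], truth)],
        "model_b": [dice_score([1, 1, 1, 1, 1, 0], truth)],
    }
}
print(best_model_per_modality(toy_scores))
```

Ranking per modality rather than by one pooled average is what makes it visible when a smaller variant wins on some modalities while a larger one wins on others.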