MCP-MedSAM: A Powerful Lightweight Medical Segment Anything Model Trained with a Single GPU in Just One Day

Donghang Lyu, Ruochen Gao, Marius Staring
2024-12-08
Abstract: Medical image segmentation involves partitioning medical images into meaningful regions, with a focus on identifying anatomical structures or abnormalities. It has broad applications in healthcare, and deep learning methods have enabled significant advancements in automating this process. Recently, the introduction of the Segment Anything Model (SAM), the first foundation model for segmentation tasks, has prompted researchers to adapt it for the medical domain to improve performance across various tasks. However, SAM's large model size and high GPU requirements hinder its scalability and development in the medical domain. To address these challenges, research has increasingly focused on lightweight adaptations of SAM that reduce its parameter count, enabling training with limited GPU resources while maintaining competitive segmentation performance. In this work, we propose MCP-MedSAM, a powerful and lightweight medical SAM model designed to be trainable on a single GPU within one day while delivering superior segmentation performance. Our method was trained and evaluated using a large-scale challenge dataset (https://www.codabench.org/competitions/1847). Compared to top-ranking methods on the challenge leaderboard, MCP-MedSAM achieved superior performance while requiring only one day of training on a single GPU. The code is publicly available at https://github.com/dong845/MCP-MedSAM.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the difficulty of applying segmentation foundation models such as SAM in medical settings: they demand substantial computing resources, take a long time to train, and adapt poorly to multiple imaging modalities. Specifically:

1. **Computing Resources and Training Time**: The original Segment Anything Model (SAM) has a large number of parameters and high GPU requirements, which limits its use in the medical field. This is a major obstacle for research groups and academic institutions with limited computing resources.
2. **Multi-modal Adaptability**: Medical image segmentation tasks involve multiple imaging modalities (such as CT, MRI, and ultrasound), while existing models are often optimized for specific modalities and generalize poorly across modalities.
3. **Data Imbalance Problem**: The data distribution across imaging modalities is unbalanced, so the model performs poorly on some modalities.

To solve these problems, the authors propose a new method named MCP-MedSAM, which aims to achieve efficient medical image segmentation through the following improvements:

- **Lightweight Architecture**: By reducing the number of model parameters, MCP-MedSAM can complete training on a single GPU in only one day, greatly reducing the demand for computing resources.
- **Introduction of New Prompt Mechanisms**: A modality prompt and a content prompt provide more relevant information and improve segmentation performance.
- **Effective Data Sampling Strategy**: A modality-based data sampling strategy alleviates the data imbalance problem so that the model performs more evenly across modalities.

These improvements not only improve the segmentation accuracy of the model but also make it suitable for a wider range of medical application scenarios and easier for more researchers to adopt and build on.
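
To make the modality-based sampling idea more concrete, the following is a minimal sketch of one common way to balance modalities: inverse-frequency weighting, where each sample is weighted so that every modality has the same total probability of being drawn. This is an illustrative assumption, not the sampling scheme confirmed by the paper; the function names `modality_balanced_weights` and `sample_indices` and the toy modality labels are hypothetical.

```python
import random
from collections import Counter


def modality_balanced_weights(modalities):
    """Return one weight per sample so that each modality has equal total probability.

    Assumption: inverse-frequency weighting; the paper's actual strategy may differ.
    """
    counts = Counter(modalities)
    n_modalities = len(counts)
    # A sample from a modality with many images gets a proportionally smaller weight.
    return [1.0 / (n_modalities * counts[m]) for m in modalities]


def sample_indices(modalities, k, seed=0):
    """Draw k sample indices with replacement using the balanced weights."""
    rng = random.Random(seed)
    weights = modality_balanced_weights(modalities)
    return rng.choices(range(len(modalities)), weights=weights, k=k)


# Toy, imbalanced dataset: 8 CT images, 3 MRI images, 1 ultrasound image.
modalities = ["CT"] * 8 + ["MRI"] * 3 + ["US"] * 1
weights = modality_balanced_weights(modalities)
drawn = sample_indices(modalities, k=6000)
drawn_counts = Counter(modalities[i] for i in drawn)
print(drawn_counts)  # each modality should appear roughly equally often
```

With these weights, the single ultrasound image is drawn about as often as all eight CT images combined, which is the intended effect of balancing but also means rare modalities are repeated heavily; in practice this trade-off would likely need tuning.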