SemiSAM: Enhancing Semi-Supervised Medical Image Segmentation via SAM-Assisted Consistency Regularization

Yichi Zhang, Jin Yang, Yuchen Liu, Yuan Cheng, Yuan Qi
2024-10-23
Abstract: Semi-supervised learning has attracted much attention due to its less dependence on acquiring abundant annotations from experts compared to fully supervised methods, which is especially important for medical image segmentation which typically requires intensive pixel/voxel-wise labeling by domain experts. Although semi-supervised methods can improve the performance by utilizing unlabeled data, there are still gaps between fully supervised methods under extremely limited annotation scenarios. In this paper, we propose a simple yet efficient strategy to explore the usage of the Segment Anything Model (SAM) for enhancing semi-supervised medical image segmentation. Concretely, the segmentation model trained with domain knowledge provides information for localization and generating input prompts to the SAM. Then the generated pseudo-labels of SAM are utilized as additional supervision to assist in the learning procedure of the semi-supervised framework. Extensive experiments demonstrate that SemiSAM significantly improves the performance of existing semi-supervised frameworks when only one or a few labeled images are available and shows strong efficiency as a plug-and-play strategy for semi-supervised medical image segmentation.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper asks how to improve semi-supervised medical image segmentation when labeled data is extremely limited. Segmentation models usually need a large amount of high-quality labeled data, but in medical imaging such data is difficult and costly to obtain because only domain experts can provide reliable labels. Existing semi-supervised methods improve performance by using unlabeled data, yet with very few labeled images they still lag well behind fully supervised methods.

To close this gap, the paper proposes SemiSAM, a strategy that enhances a semi-supervised segmentation framework by combining it with the Segment Anything Model (SAM). SemiSAM uses the pre-trained SAM to generate pseudo-labels and treats them as an additional supervision signal, which helps the model make better use of unlabeled data during training. The experiments show that SemiSAM significantly improves existing semi-supervised frameworks when only a small amount of labeled data is available.

### Main contributions

1. **Introducing SAM as an additional supervision signal**: SemiSAM uses the strong generalization ability of SAM to generate pseudo-labels that enhance the semi-supervised learning framework, improving model performance when labeled data is extremely limited.
2. **Uncertainty-aware strategy**: To reduce the impact of noisy prompts derived from coarse segmentation, SemiSAM selects prompt points only from low-uncertainty regions, which gives SAM more reliable guidance.
3. **Extensive experimental verification**: Experiments on the Left Atrium (LA) dataset show that SemiSAM significantly improves segmentation performance, especially the Dice similarity coefficient, with only 1 to 4 labeled images.

### Method overview

1. **Semi-supervised framework**: SemiSAM builds on classical semi-supervised learning frameworks such as Mean Teacher (MT). The framework contains two main parts: the main branch produces the main segmentation output, and the consistency branch produces additional segmentation outputs.
2. **SAM-assisted consistency regularization**: SemiSAM adds a SAM-assisted supervision branch. The segmentation output of the main branch is used to generate input prompts for SAM, and SAM returns pseudo-labels. The consistency loss \( L_{\text{sam}} \) between the main branch output and these pseudo-labels serves as an additional supervision signal during training.
3. **Optimization objective**: The network is trained by minimizing a weighted combination of the supervised segmentation loss \( L_{\text{sup}} \), the unsupervised consistency loss \( L_{\text{con}} \) and the SAM consistency loss \( L_{\text{sam}} \):

\[ \min_{\theta} \; L_{\text{sup}}(f_{\theta}(X_i), Y_i) + \lambda_c L_{\text{con}}(f_{\theta}(X_j), f_{\theta'}(X_j)) + \lambda_s L_{\text{sam}}(f_{\theta}(X_j), F_{\Theta}(X_j)) \]

where \(\theta\), \(\theta'\) and \(\Theta\) are the weights of the student model, the teacher model and SAM respectively, \((X_i, Y_i)\) is a labeled image with its annotation, \(X_j\) is an unlabeled image, and \(\lambda_c\) and \(\lambda_s\) weight the two consistency terms.

### Experimental results

- **Performance improvement**: With only 1, 2 and 4 labeled images, SemiSAM increases the Dice similarity coefficient by 10.78%, 11.29% and 8.02% respectively.
- **Comparison with other methods**: SemiSAM outperforms other semi-supervised methods on multiple evaluation metrics, especially when labeled data is extremely limited.
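The optimization objective and the uncertainty-aware prompt selection from the method overview can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the use of binary entropy as the uncertainty measure, the entropy threshold, soft Dice for \( L_{\text{sup}} \) and \( L_{\text{sam}} \), mean squared error for \( L_{\text{con}} \), and the default weights are all assumptions made for illustration. The SAM output is passed in as a precomputed mask rather than obtained by calling SAM.

```python
import numpy as np


def dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss between a foreground probability map and a target mask."""
    inter = np.sum(prob * target)
    return float(1.0 - (2.0 * inter + eps) / (np.sum(prob) + np.sum(target) + eps))


def mse_consistency(a, b):
    """Mean squared difference between two probability maps."""
    return float(np.mean((a - b) ** 2))


def low_uncertainty_points(prob, num_points=3, max_entropy=0.3):
    """Return up to num_points foreground coordinates with the lowest entropy.

    Assumption: uncertainty is the binary entropy of the predicted foreground
    probability, and max_entropy is a hypothetical cut-off.
    """
    p = np.clip(prob, 1e-6, 1.0 - 1e-6)
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    candidates = np.argwhere((prob > 0.5) & (entropy < max_entropy))
    if len(candidates) == 0:
        return candidates  # no confident foreground voxel, so no prompt
    order = np.argsort(entropy[tuple(candidates.T)])
    return candidates[order[:num_points]]


def semisam_objective(student_l, label, student_u, teacher_u, sam_u,
                      lambda_c=0.1, lambda_s=0.1):
    """L_sup + lambda_c * L_con + lambda_s * L_sam for one labeled/unlabeled pair."""
    l_sup = dice_loss(student_l, label)            # student vs. expert label
    l_con = mse_consistency(student_u, teacher_u)  # student vs. teacher
    l_sam = dice_loss(student_u, sam_u)            # student vs. SAM pseudo-label
    return l_sup + lambda_c * l_con + lambda_s * l_sam


if __name__ == "__main__":
    student_u = np.array([[0.99, 0.60], [0.10, 0.95]])
    teacher_u = np.array([[0.97, 0.55], [0.15, 0.90]])
    sam_u = np.array([[1.0, 0.0], [0.0, 1.0]])  # stand-in for a SAM mask
    label = np.array([[1.0, 0.0], [0.0, 1.0]])
    print(low_uncertainty_points(student_u, num_points=2))
    print(semisam_objective(student_u, label, student_u, teacher_u, sam_u))
```

In this sketch the prompt points come from the student's own prediction, so the threshold on entropy controls how conservative the guidance given to SAM is. Setting `lambda_s` to zero reduces the objective to the plain student-teacher consistency setup.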
In conclusion, the paper narrows the performance gap of semi-supervised medical image segmentation under extremely limited annotations by introducing a SAM-assisted consistency regularization strategy.