Intrapartum Ultrasound Image Segmentation of Pubic Symphysis and Fetal Head Using Dual Student-Teacher Framework with CNN-ViT Collaborative Learning

Jianmei Jiang, Huijin Wang, Jieyun Bai, Shun Long, Shuangping Chen, Victor M. Campello, Karim Lekadir
2024-09-11
Abstract: The segmentation of the pubic symphysis and fetal head (PSFH) constitutes a pivotal step in monitoring labor progression and identifying potential delivery complications. Despite the advances in deep learning, the lack of annotated medical images hinders the training of segmentation models. Traditional semi-supervised learning approaches primarily utilize a unified network model based on Convolutional Neural Networks (CNNs) and apply consistency regularization to mitigate the reliance on extensive annotated data. However, these methods often fall short in capturing the discriminative features of unlabeled data and in delineating the long-range dependencies inherent in the ambiguous boundaries of PSFH within ultrasound images. To address these limitations, we introduce a novel framework, the Dual-Student and Teacher Combining CNN and Transformer (DSTCT), which synergistically integrates the capabilities of CNNs and Transformers. Our framework comprises a Vision Transformer (ViT) as the teacher and two student models: one ViT and one CNN. This dual-student setup enables mutual supervision through the generation of both hard and soft pseudo-labels, with the consistency in their predictions being refined by minimizing the classifier determinacy discrepancy. The teacher model further reinforces learning within this architecture through the imposition of consistency regularization constraints. To augment the generalization abilities of our approach, we employ a blend of data and model perturbation techniques. Comprehensive evaluations on the benchmark dataset of the PSFH Segmentation Grand Challenge at MICCAI 2023 demonstrate that our DSTCT framework outperforms ten contemporary semi-supervised segmentation methods. Code is available at https://github.com/jjm1589/DSTCT.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the segmentation of the pubic symphysis and fetal head (PSFH) in intrapartum ultrasound images. Deep learning, particularly Convolutional Neural Networks (CNNs) and Transformer models, has substantially advanced medical image segmentation, but clinical application remains limited by the scarcity of large annotated datasets. Annotating ultrasound images requires substantial time and expertise, which makes semi-supervised learning particularly important. However, existing semi-supervised methods typically rely on a unified CNN-based network and perform poorly at capturing discriminative features from unlabeled data and at modeling long-range dependencies in the ambiguous boundary regions of ultrasound images. To address these issues, the researchers propose a new framework, the Dual-Student and Teacher Combining CNN and Transformer (DSTCT). It leverages the strengths of both architectures through two student models (one ViT-based, one CNN-based) that supervise each other: each student's hard pseudo-labels serve as training targets for the other, which effectively expands the training set with unlabeled images. A soft pseudo-label consistency strategy is additionally proposed to reduce label noise and promote entropy minimization. Minimizing the Classifier Determinacy Discrepancy (CDD) between the students and applying Consistency Regularization (CR) with the ViT teacher further enhance the model's generalization ability. Experimental results show that DSTCT outperforms ten current semi-supervised segmentation methods on the benchmark dataset of the MICCAI 2023 PSFH segmentation challenge.
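The loss terms on unlabeled images described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the loss weights `w_cdd` and `w_cr`, the mean-squared-error form of the consistency term, and the exponential-moving-average (EMA) teacher update are all assumptions made for the example. The CDD is assumed here to be the off-diagonal mass of the outer product of the two students' class-probability vectors, which may differ from the exact form used in the paper. The supervised loss on labeled images and the data and model perturbations are omitted.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_entropy(probs, labels, eps=1e-8):
    """Mean cross-entropy of probabilities (N, C) against integer labels (N,)."""
    picked = probs[np.arange(labels.shape[0]), labels]
    return float(-np.log(picked + eps).mean())

def cdd(p_a, p_b):
    """Assumed CDD: off-diagonal mass of the outer product p_a p_b^T per pixel.

    For probability vectors this equals 1 - sum_c p_a[c] * p_b[c]; it is zero
    only when both students give the same one-hot prediction.
    """
    return float((1.0 - (p_a * p_b).sum(axis=-1)).mean())

def unsupervised_loss(logits_cnn, logits_vit, logits_teacher, w_cdd=1.0, w_cr=1.0):
    """Combine the three unlabeled-data terms (the weights are hypothetical)."""
    p_cnn = softmax(logits_cnn)
    p_vit = softmax(logits_vit)
    p_teacher = softmax(logits_teacher)

    # Cross-supervision: each student is trained on the other's hard pseudo-label.
    hard_cnn = p_cnn.argmax(axis=-1)
    hard_vit = p_vit.argmax(axis=-1)
    cross = cross_entropy(p_cnn, hard_vit) + cross_entropy(p_vit, hard_cnn)

    # Soft pseudo-label agreement between the two students.
    soft = cdd(p_cnn, p_vit)

    # Consistency regularization of both students against the teacher (MSE assumed).
    cr = float(((p_cnn - p_teacher) ** 2).mean() + ((p_vit - p_teacher) ** 2).mean())

    return cross + w_cdd * soft + w_cr * cr

def ema_update(teacher_w, student_w, decay=0.99):
    """Assumed mean-teacher style update of teacher weights from a student."""
    return decay * teacher_w + (1.0 - decay) * student_w

# Toy usage: 4 "pixels", 3 classes (background, pubic symphysis, fetal head).
rng = np.random.default_rng(0)
logits_cnn = rng.normal(size=(4, 3))
logits_vit = rng.normal(size=(4, 3))
logits_teacher = rng.normal(size=(4, 3))
loss = unsupervised_loss(logits_cnn, logits_vit, logits_teacher)
```

In this sketch each row is one pixel's class logits, so a real segmentation output of shape (batch, classes, height, width) would first be reshaped to (pixels, classes). The CDD term is small only when both students are confident and agree, which is one way to read the summary's statement that the soft pseudo-label strategy promotes entropy minimization.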