SCUNet++: Swin-UNet and CNN Bottleneck Hybrid Architecture with Multi-Fusion Dense Skip Connection for Pulmonary Embolism CT Image Segmentation

Yifei Chen, Binfeng Zou, Zhaoxin Guo, Yiyu Huang, Yifan Huang, Feiwei Qin, Qinhai Li, Changmiao Wang
2024-01-03
Abstract: Pulmonary embolism (PE) is a prevalent lung disease that can lead to right ventricular hypertrophy and failure in severe cases, ranking second in severity only to myocardial infarction and sudden death. CT pulmonary angiography (CTPA) is a widely used diagnostic method for PE. However, PE detection presents challenges in clinical practice due to limitations in imaging technology. CTPA can produce noise similar to PE, making confirmation of its presence time-consuming and prone to overdiagnosis. Moreover, traditional PE segmentation methods cannot fully account for the hierarchical structure of features or the local and global spatial features of PE CT images. In this paper, we propose an automatic PE segmentation method called SCUNet++ (Swin Conv UNet++). This method incorporates multiple fusion dense skip connections between the encoder and decoder, uses the Swin Transformer as the encoder, and fuses features of different scales in the decoder subnetwork to compensate for the spatial information loss caused by the inevitable downsampling in Swin-UNet and other state-of-the-art methods, thereby addressing the above problem. We provide a detailed theoretical analysis of this method and validate it on the publicly available PE CT image datasets FUMPE and CAD-PE. The experimental results indicate that our proposed method achieved a Dice similarity coefficient (DSC) of 83.47% and a Hausdorff distance 95th percentile (HD95) of 3.83 on the FUMPE dataset, as well as a DSC of 83.42% and an HD95 of 5.10 on the CAD-PE dataset. These findings demonstrate that our method exhibits strong performance in PE segmentation tasks, potentially enhancing the accuracy of automatic PE segmentation and providing a powerful diagnostic tool for clinical physicians. Our source code and new FUMPE dataset are available at <a class="link-external link-https" href="https://github.com/JustlfC03/SCUNet-plusplus" rel="external noopener nofollow">this https URL</a>.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper attempts to address several key issues in the segmentation of Pulmonary Embolism (PE) CT images:

1. **Limitations of traditional segmentation methods**: Existing PE segmentation methods fail to fully consider the hierarchical features of CT images, as well as their local and global spatial features, which limits segmentation accuracy.
2. **Challenges in clinical diagnosis**: In clinical practice, due to the limitations of imaging technology, CT Pulmonary Angiography (CTPA) may produce noise similar to PE, which makes confirming the presence of PE time-consuming and prone to misdiagnosis.
3. **Deficiencies of existing models**: Although advanced methods like Swin-UNet perform well in learning global semantic features, they lack convolution operations during the upsampling process, so local spatial features are extracted insufficiently and fine segmentation detail suffers.

To address these issues, the paper proposes an automatic PE segmentation method named SCUNet++ (Swin Conv UNet++). The main innovations of this method include:

- **Multi-fusion dense skip connections**: Multi-fusion dense skip connections between the encoder and decoder compensate for the loss of spatial information caused by downsampling.
- **Swin Transformer as the encoder**: The Swin Transformer module captures global and long-range semantic information.
- **CNN bottleneck module**: A CNN module in the bottleneck layer compensates for the Swin Transformer's shortcomings in extracting local spatial features.

Experimental results show that SCUNet++ achieves Dice Similarity Coefficients (DSC) of 83.47% and 83.42% on the FUMPE and CAD-PE datasets, respectively, and Hausdorff Distance 95th percentile (HD95) values of 3.83 and 5.10, which the paper reports as outperforming the other traditional and advanced segmentation models it compares against.
This indicates that SCUNet++ has high accuracy in PE segmentation tasks and has the potential to become a powerful diagnostic tool for clinicians.
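
The two reported metrics, DSC and HD95, can be illustrated with a minimal sketch on toy 2D binary masks. This is not the paper's evaluation code. The function names (`dice`, `boundary`, `percentile`, `hd95`), the 4-neighbour boundary definition, and the choice to pool both directed boundary distances before taking the 95th percentile are assumptions for illustration. The summary also does not state the unit of the reported HD95 values, so the sketch works in pixel units unless a `spacing` is given.

```python
import math

def dice(pred, truth):
    """Dice similarity coefficient of two binary masks (lists of rows of 0/1)."""
    inter = sum(p * t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    return 1.0 if total == 0 else 2.0 * inter / total

def boundary(mask):
    """Foreground pixels with a 4-neighbour that is background or off-image."""
    h, w = len(mask), len(mask[0])
    pts = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            nbrs = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            if any(not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]
                   for ny, nx in nbrs):
                pts.append((y, x))
    return pts

def percentile(values, q):
    """q-th percentile with linear interpolation between sorted samples."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def hd95(pred, truth, spacing=(1.0, 1.0)):
    """95th percentile of pooled boundary-to-boundary distances (assumed convention)."""
    a, b = boundary(pred), boundary(truth)
    if not a or not b:
        raise ValueError("both masks need at least one foreground pixel")

    def dist(p, q):
        return math.hypot((p[0] - q[0]) * spacing[0], (p[1] - q[1]) * spacing[1])

    d_ab = [min(dist(p, q) for q in b) for p in a]  # pred -> truth
    d_ba = [min(dist(q, p) for p in a) for q in b]  # truth -> pred
    return percentile(d_ab + d_ba, 95)

# Toy example: a 4x4 square and the same square shifted one pixel to the right.
truth = [[1 if 1 <= y <= 4 and 1 <= x <= 4 else 0 for x in range(6)] for y in range(6)]
pred = [[1 if 1 <= y <= 4 and 2 <= x <= 5 else 0 for x in range(6)] for y in range(6)]

print(dice(pred, truth))  # 0.75
print(hd95(pred, truth))  # 1.0
```

A higher DSC means more overlap and a lower HD95 means closer boundaries, so the two numbers move in opposite directions as segmentation improves. The brute-force nearest-point search is quadratic in the number of boundary pixels; real CT volumes would normally use a distance-transform-based implementation instead.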