Modality preserving U-Net for segmentation of multimodal medical images
Bingxuan Wu,Fan Zhang,Liang Xu,Shuwei Shen,Pengfei Shao,Mingzhai Sun,Peng Liu,Peng Yao,Ronald X Xu
DOI: https://doi.org/10.21037/qims-22-1367
2023-08-01
Abstract:Background: Recent advances in artificial intelligence and digital image processing have inspired the use of deep neural networks for segmentation tasks in multimodal medical imaging. Unlike natural images, multimodal medical images contain much richer information regarding different modal properties and therefore present more challenges for semantic segmentation. However, there is no report on systematic research that integrates multi-scaled and structured analysis of single-modal and multimodal medical images. Methods: We propose a deep neural network, named as Modality Preserving U-Net (MPU-Net), for modality-preserving analysis and segmentation of medical targets from multimodal medical images. The proposed MPU-Net consists of a modality preservation encoder (MPE) module that preserves the feature independency among the modalities and a modality fusion decoder (MFD) module that performs a multiscale feature fusion analysis for each modality in order to provide a rich feature representation for the final task. The effectiveness of such a single-modal preservation and multimodal fusion feature extraction approach is verified by multimodal segmentation experiments and an ablation study using brain tumor and prostate datasets from Medical Segmentation Decathlon (MSD). Results: The segmentation experiments demonstrated the superiority of MPU-Net over other methods in the segmentation tasks for multimodal medical images. In the brain tumor segmentation tasks, the Dice scores (DSCs) for the whole tumor (WT), the tumor core (TC) and the enhancing tumor (ET) regions were 89.42%, 86.92%, and 84.59%, respectively. In the meanwhile, the 95% Hausdorff distance (HD95) results were 3.530, 4.899 and 2.555, respectively. In the prostate segmentation tasks, the DSCs for the peripheral zone (PZ) and the transitional zone (TZ) of the prostate were 71.20% and 90.38%, respectively. In the meanwhile, the 95% HD95 results were 6.367 and 4.766, respectively. The ablation study showed that the combination of single-modal preservation and multimodal fusion methods improved the performance of multimodal medical image feature analysis. Conclusions: In the segmentation tasks using brain tumor and prostate datasets, the MPU-Net method has achieved the improved performance in comparison with the conventional methods, indicating its potential application for other segmentation tasks in multimodal medical images.