3D Reconstruction-Oriented Fully Automatic Multi-Modal Tumor Segmentation by Dual Attention-Guided VNet

Dongdong Meng, Sheng Li, Bin Sheng, Hao Wu, Suqing Tian, Wenjun Ma, Guoping Wang, Xueqing Yan
DOI: https://doi.org/10.1007/s00371-023-02965-0
IF: 2.835
2023-01-01
The Visual Computer
Abstract: Existing automatic contouring methods for primary nasopharyngeal carcinoma (NPC) and metastatic lymph nodes (MLNs) often suffer from low segmentation accuracy and do not handle multi-modal images well. High inter-patient physiological variation and ineffective multi-modal information fusion make the task harder still. To address these issues, we present a 3D reconstruction-oriented, fully automatic multi-modal segmentation method that delineates primary NPC tumors and MLNs with a dual attention-guided VNet. Specifically, we use a physiologically-sensitive feature enhancement (PFE) module that emphasizes long-range spatial context in tumor regions of interest and thereby copes with the variability arising from inter-patient differences. This helps extract 3D spatial features and facilitates high-quality reconstruction of the 3D tumor geometry. We then develop a multi-modal feature aggregation (MFA) module that describes multi-scale, modality-aware features and aggregates information across the multi-modal images. To the best of our knowledge, this is the first fully automatic, highly accurate segmentation framework for primary NPC tumors and MLNs on combined CT-MR datasets. Experimental results on clinical datasets validate the effectiveness of our method and show that it outperforms state-of-the-art methods.
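The abstract names two ideas without giving their internals: attention over long-range spatial context, and aggregation of features from several modalities. The NumPy sketch below is only a generic illustration of those two ideas, not the paper's PFE or MFA modules. The function names, the parameter-free dot-product attention, and the pooling-based modality weights are all assumptions made for illustration; a real network would use learned projections.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_self_attention(feat):
    """Hypothetical long-range spatial attention (not the paper's PFE).

    feat has shape (C, D, H, W). Every voxel attends to every other voxel
    via scaled dot-product similarity, and the attended features are added
    back to the input as a residual.
    """
    c = feat.shape[0]
    x = feat.reshape(c, -1).T             # (N, C), one row per voxel
    scores = x @ x.T / np.sqrt(c)         # (N, N) pairwise similarities
    attn = softmax(scores, axis=-1)       # each row sums to 1
    out = attn @ x                        # (N, C) context-weighted features
    return feat + out.T.reshape(feat.shape)


def fuse_modalities(ct_feat, mr_feat):
    """Hypothetical modality-aware fusion (not the paper's MFA).

    Each channel gets one weight per modality, computed by a softmax over
    the globally average-pooled responses, and the two feature maps are
    combined as a weighted sum.
    """
    stacked = np.stack([ct_feat, mr_feat])       # (2, C, D, H, W)
    desc = stacked.mean(axis=(2, 3, 4))          # (2, C) pooled descriptors
    weights = softmax(desc, axis=0)              # sums to 1 over modalities
    fused = (weights[:, :, None, None, None] * stacked).sum(axis=0)
    return fused, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ct = rng.standard_normal((4, 2, 3, 3))       # toy CT feature map
    mr = rng.standard_normal((4, 2, 3, 3))       # toy MR feature map
    enhanced = spatial_self_attention(ct)
    fused, weights = fuse_modalities(enhanced, mr)
    print(fused.shape, weights.shape)
```

The pairwise attention matrix grows quadratically with the number of voxels, so a sketch like this is only practical on small feature maps such as those found deep in an encoder.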