Bilateral Cross-Modal Fusion Network for Multimodal Whole-Body Tumor Segmentation

Tao Yu, Xinrui Zhan, Lei Xiang, Xin Gao, Tao Zhou
DOI: https://doi.org/10.1109/isbi56570.2024.10635400
2024-01-01
Abstract: 3D whole-body tumor segmentation plays a critical role in cancer treatment. Multimodal fluorodeoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT) has become a standard imaging procedure for evaluating widespread cancers. Existing multimodal segmentation methods often employ spatial and channel attention mechanisms to fuse features from different modalities. However, using attention mechanisms for modality fusion can introduce feature interference and distort detailed image features. In this paper, we propose a novel Bilateral Cross-modal Fusion Network (BCFNet) for multimodal 3D whole-body tumor segmentation, which makes effective use of multimodal information to improve segmentation performance. Specifically, we propose a Multimodal Enhancement Fusion (MEF) module that integrates features from PET and CT images to capture complementary information across modalities. Additionally, a Local Refinement Module (LRM) is presented to strengthen the feature representations in the decoder network. Furthermore, a Feature Aggregation Module (FAM) is proposed to fully integrate earlier features from the multimodal data and supplement the output features. Experimental results demonstrate that BCFNet outperforms the compared methods.