Attention-enhanced multiscale feature fusion network for pancreas and tumor segmentation

Kaiqi Dong, Peijun Hu, Yan Zhu, Yu Tian, Xiang Li, Tianshu Zhou, Xueli Bai, Tingbo Liang, Jingsong Li
DOI: https://doi.org/10.1002/mp.17385
2024-09-22
Abstract:
Background: Accurate pancreas and pancreatic tumor segmentation from abdominal scans is crucial for diagnosing and treating pancreatic diseases. Automated and reliable segmentation algorithms are highly desirable in both clinical practice and research.
Purpose: Segmenting the pancreas and tumors is challenging due to their low contrast, irregular morphologies, and variable anatomical locations. The substantial difference in size between the pancreas and small tumors makes the task harder still. This paper proposes an attention-enhanced multiscale feature fusion network (AMFF-Net) to address these issues via 3D attention and multiscale context fusion methods.
Methods: First, to prevent missed segmentation of tumors, we design residual depthwise attention modules (RDAMs) that extract global features by expanding the receptive fields of shallow layers in the encoder. Second, hybrid transformer modules (HTMs) are proposed to model deep semantic features, suppressing irrelevant regions while highlighting critical anatomical characteristics. Additionally, the multiscale feature fusion module (MFFM) fuses adjacent top and bottom scale semantic features to address the size imbalance issue.
Results: The proposed AMFF-Net was evaluated on the public MSD dataset, achieving a Dice similarity coefficient (DSC) of 82.12% for the pancreas and 57.00% for tumors. It also demonstrated effective segmentation performance on the NIH and private datasets, outperforming previous state-of-the-art (SOTA) methods. Ablation studies verify the effectiveness of the RDAMs, HTMs, and MFFM.
Conclusions: We propose an effective deep learning network for pancreas and tumor segmentation from abdominal CT scans. The proposed modules can better leverage global dependencies and semantic information and achieve significantly higher accuracy than the previous SOTA methods.
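The DSC figures in the Results are Dice similarity coefficients, computed per class as 2|P ∩ T| / (|P| + |T|) for a predicted mask P and a reference mask T. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name `dice_score`, the flattened label-map input, the label convention (1 = pancreas, 2 = tumor), and the choice to score two empty masks as 1.0 are all assumptions made for illustration.

```python
def dice_score(pred, target, label):
    """Dice similarity coefficient for one class.

    pred and target are equally long sequences of integer voxel labels
    (a flattened segmentation volume). The label values are hypothetical.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")

    pred_count = 0    # voxels predicted as `label`
    target_count = 0  # voxels annotated as `label`
    overlap = 0       # voxels where both agree on `label`
    for p, t in zip(pred, target):
        if p == label:
            pred_count += 1
        if t == label:
            target_count += 1
        if p == label and t == label:
            overlap += 1

    denom = pred_count + target_count
    if denom == 0:
        # Assumption: a class absent from both masks counts as perfect agreement.
        return 1.0
    return 2.0 * overlap / denom


# Toy example with assumed labels: 0 = background, 1 = pancreas, 2 = tumor.
target = [0, 1, 1, 1, 1, 0, 2, 0, 0, 0]
pred   = [1, 1, 1, 1, 1, 1, 2, 2, 0, 0]

pancreas_dsc = dice_score(pred, target, label=1)
tumor_dsc = dice_score(pred, target, label=2)
```

Because the denominator sums both mask sizes, a single spurious voxel costs a small tumor far more than it costs the much larger pancreas, which is one reason tumor DSC is typically the lower of the two scores.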