A Lightweight Multi-Scale Multi-Angle Dynamic Interactive Transformer-CNN Fusion Model for 3D Medical Image Segmentation

Xin Hua, Zhijiang Du, Hongjian Yu, Jixin Ma, Fanjun Zheng, Chen Zhang, Qiaohui Lu, Hui Zhao
DOI: https://doi.org/10.1016/j.neucom.2024.128417
IF: 6
2024-01-01
Neurocomputing
Abstract: Combining Convolutional Neural Networks (CNNs) and Transformers has become one of the mainstream approaches to three-dimensional (3D) medical image segmentation. However, the complexity and diversity of target forms in 3D medical images require models to capture complex feature information for segmentation, which leads to an excessive number of parameters and hinders training and deployment. We therefore developed a lightweight 3D multi-target semantic segmentation model. To strengthen contextual texture connections and reinforce the expression of detailed features, we designed a multi-scale, multi-angle feature interaction module that enhances feature representation by letting multi-scale features interact from different perspectives. To address attention collapse in Transformers, which causes other detailed features to be neglected, we use local features as dynamic parameters that interact with global features, dynamically grouping and learning critical features from the global features and thereby improving the model's ability to learn detailed features. While preserving the model's segmentation capability, we kept it lightweight, with a total of 9.63M parameters. Extensive experiments were conducted on the public datasets ACDC and Brats2018 and on a private dataset, Temporal Bone CT. The results indicate that the proposed model is more competitive than the latest techniques in 3D medical image segmentation.
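To make a figure such as "9.63M parameters" concrete, the sketch below shows how learnable-parameter counts are usually tallied for two common layer types. It is only an illustration of the arithmetic: the helper names and the layer sizes are hypothetical and are not taken from the paper's architecture.

```python
def conv3d_params(in_channels, out_channels, kernel_size, bias=True):
    """Learnable parameters in a standard 3D convolution with a cubic kernel."""
    weights = in_channels * out_channels * kernel_size ** 3
    return weights + (out_channels if bias else 0)


def linear_params(in_features, out_features, bias=True):
    """Learnable parameters in a fully connected (linear) layer."""
    return in_features * out_features + (out_features if bias else 0)


def in_millions(count):
    """Format a raw parameter count in millions with two decimals."""
    return f"{count / 1e6:.2f}M"


# Hypothetical layer sizes, chosen only to illustrate the arithmetic;
# they do not describe the model in the paper.
layers = [
    conv3d_params(1, 32, 3),    # 1 * 32 * 27 + 32
    conv3d_params(32, 64, 3),   # 32 * 64 * 27 + 64
    linear_params(256, 1024),   # 256 * 1024 + 1024
]
total = sum(layers)
print(total, in_millions(total))
```

Because the kernel volume grows cubically with kernel size, 3D convolutions accumulate parameters much faster than their 2D counterparts, which is one reason parameter budgets matter for 3D segmentation models.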