ResMT: A Hybrid CNN-Transformer Framework for Glioma Grading with 3D MRI

Honghao Cui, Zhuoying Ruan, Zhijian Xu, Xiao Luo, Jian Dai, Daoying Geng
DOI: https://doi.org/10.1016/j.compeleceng.2024.109745
2024-01-01
Abstract: Accurate grading of gliomas is crucial for treatment planning and prognosis. Convolutional neural networks (CNNs) are effective at classifying medical images but struggle to capture long-range dependencies among pixels. Transformer-based networks address this limitation, yet CNN-based methods often perform better when trained on small datasets. In addition, tumor segmentation is essential for classification models, but training a separate segmentation model significantly increases the workload. To address these challenges, we propose ResMT, which combines CNN and transformer architectures for glioma grading and extracts both local and global features efficiently. Specifically, we design a spatial residual module (SRM) in which a 3D CNN captures the volumetric complexity of gliomas, while Swin UNETR, a pre-trained segmentation model, enhances the network without extra training. The model also includes a multi-plane channel and spatial attention module (MCSA) that focuses on critical features across the axial, coronal, and sagittal planes. Transformer blocks then establish long-range relationships among planes and slices. We evaluated ResMT on the BraTS19 dataset against baselines and state-of-the-art models. The results show that ResMT achieves the highest prediction performance, with an AUC of 0.9953, highlighting the potential of hybrid CNN-transformer models for 3D MRI classification.
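The abstract reports performance as an AUC (area under the ROC curve). As a reminder of what that number measures, the following is a minimal sketch in plain Python that computes AUC through the pairwise-ranking (Mann-Whitney) identity: the fraction of positive/negative case pairs in which the positive case receives the higher score. The function name `roc_auc`, the choice of high-grade glioma as the positive class, and the toy scores are assumptions made for illustration; they are not taken from the paper and do not reproduce its evaluation code or results.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking (Mann-Whitney) identity.

    labels: iterable of 0/1, where 1 marks the positive class
            (assumed here to be high-grade glioma)
    scores: predicted score or probability for the positive class
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions, not results from the paper.
labels = [1, 1, 1, 0, 0]
scores = [0.92, 0.80, 0.35, 0.40, 0.10]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")  # prints "AUC = 0.8333" (5 of 6 pairs ranked correctly)
```

The quadratic pairwise loop is chosen for readability; for larger evaluation sets a sort-based rank computation or an established library routine would be the more practical choice.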