Shifted window-based Transformer with multimodal representation for the systematic staging of rectal cancer

Haoyu Wang, Peihong Li
DOI: https://doi.org/10.1007/s11761-024-00400-3
2024-05-16
Service Oriented Computing and Applications
Abstract: Systematic staging of rectal cancer aims to determine the degree of tumor invasion and the lymph node metastasis (LNM) status. Artificial intelligence technologies can help physicians make more accurate therapeutic decisions. Current research on rectal cancer segmentation relies primarily on convolutional neural networks, but the inherent limitations of convolution operations often prevent effective capture of long-distance dependencies. Moreover, existing LNM diagnosis methods typically require manual extraction of radiomics features from rectal cancer lesions, and the efficacy of these features depends heavily on the specific dataset employed. In this paper, we propose a Transformer-based multi-modal rectal cancer diagnostic framework. The framework uses the hierarchical feature representation of the Swin Transformer to segment tumors accurately and adaptively extracts multi-scale features for LNM diagnosis. Compared with current state-of-the-art models, our model improves the accuracy of tumor segmentation and LNM classification by 3.62% and 4.10%, respectively.
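The abstract does not describe the implementation, so the following is only a minimal sketch of the general shifted-window partitioning idea behind the Swin Transformer, not the authors' code. It assumes a toy 4x4 grid of token indices, a window size of 2, and a shift of half the window; the helper names `cyclic_shift` and `window_partition` are hypothetical, and the attention masking that normally accompanies the shift is omitted.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D grid up and left by `shift` positions, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [
        [grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
        for r in range(h)
    ]


def window_partition(grid, window):
    """Split a 2D grid into non-overlapping window x window blocks (row-major)."""
    h, w = len(grid), len(grid[0])
    if h % window or w % window:
        raise ValueError("grid size must be divisible by window size")
    windows = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            block = [row[left:left + window] for row in grid[top:top + window]]
            windows.append(block)
    return windows


# Toy 4x4 grid whose entries are token indices 0..15 (illustrative only).
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

# Regular partition: attention would be computed inside each 2x2 window.
regular = window_partition(grid, 2)

# Shifted partition: shift by half a window, then partition again, so the
# new windows straddle the boundaries of the regular ones.
shifted = window_partition(cyclic_shift(grid, 1), 2)

print(regular[0])  # [[0, 1], [4, 5]]
print(shifted[0])  # [[5, 6], [9, 10]]
```

In this toy example the first shifted window gathers tokens 5, 6, 9, and 10, one from each of the four regular windows. Alternating regular and shifted layers in this way is how the shifted-window scheme lets information cross window boundaries, which is relevant to the long-distance dependency problem the abstract raises.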