MSEDTNet: Multi-Scale Encoder and Decoder with Transformer for Bladder Tumor Segmentation

Yixing Wang, Xiufen Ye
DOI: https://doi.org/10.3390/electronics11203347
IF: 2.9
2022-01-01
Electronics
Abstract: The precise segmentation of bladder tumors from MRI is essential for bladder cancer diagnosis and personalized therapy selection. Owing to the properties of tumor morphology, achieving precise segmentation from MRI images remains challenging. In recent years, deep convolutional neural networks have provided a promising solution for bladder tumor segmentation from MRI. However, deep-learning-based methods still face two weaknesses: (1) multi-scale feature extraction and utilization are inadequate, being limited by the learning approach; (2) establishing explicit long-distance dependencies is difficult because of the limited receptive field of convolution kernels. These limitations hinder the learning of global semantic information, which is critical for bladder cancer segmentation. To tackle these problems, a new auxiliary segmentation algorithm called MSEDTNet is proposed, which integrates a multi-scale encoder and decoder with a transformer. Specifically, the designed encoder with multi-scale pyramidal convolution (MSPC) generates compact feature maps that capture the richly detailed local features of the image. A transformer bottleneck then models the long-distance dependencies between high-level tumor semantics in a global space. Finally, a decoder with a spatial context fusion module (SCFM) fuses the context information and gradually produces high-resolution segmentation results. Experimental results on T2-weighted MRI scans from 86 patients show that MSEDTNet achieves an overall Jaccard index of 83.46% and a Dice similarity coefficient of 92.35%, with lower complexity than other, similar models. This suggests that the proposed method can be used as an efficient tool for clinical bladder cancer segmentation.
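The abstract reports a Jaccard index and a Dice similarity coefficient without defining them. As a minimal sketch, assuming the standard set-overlap definitions (Jaccard = |A∩B| / |A∪B|, Dice = 2|A∩B| / (|A| + |B|)) applied to binary masks, the two metrics could be computed as below. The function name `overlap_metrics` and the toy masks are hypothetical illustrations, not the authors' evaluation code or data.

```python
def overlap_metrics(pred, truth):
    """Return (jaccard, dice) for two binary masks given as nested lists.

    Hypothetical helper for illustration; assumes both masks have the
    same number of pixels and treats any truthy value as foreground.
    """
    flat_pred = [bool(v) for row in pred for v in row]
    flat_truth = [bool(v) for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(p and t for p, t in zip(flat_pred, flat_truth))
    pred_area = sum(flat_pred)
    truth_area = sum(flat_truth)
    union = pred_area + truth_area - intersection

    if union == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0, 1.0

    jaccard = intersection / union
    dice = 2 * intersection / (pred_area + truth_area)
    return jaccard, dice


# Toy 3x4 masks (made up for this example).
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
truth = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 1, 0, 0]]

jaccard, dice = overlap_metrics(pred, truth)
print(f"Jaccard = {jaccard:.2f}, Dice = {dice:.2f}")  # Jaccard = 0.60, Dice = 0.75
```

Both scores range from 0 (no overlap) to 1 (identical masks); how the paper aggregates them across slices or patients is not stated in the abstract.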