MSTCNet: Parallel Multi-Scale Network For Medical Image Segmentation.

Hongzhao Xiao, Jie Tang, Hongjian Song, Juncheng Hu
DOI: https://doi.org/10.1145/3622896.3622917
2023-01-01
Abstract: Transformer-like architectures, the model of choice in natural language processing, have recently been adapted to computer vision (CV) and have proven remarkably effective on various CV tasks. However, current transformer-based methods require large-scale datasets, which are usually unavailable in medical image analysis, and their performance suffers as a result. To this end, we propose a novel segmentation model, named MSTCNet, which constructs a parallel multi-scale transformer (MST) encoder in U-Net. In MST, we devise multi-scale patch partition and multi-scale mix attention to model long-range dependencies at multiple scales. The U-Net encoder, running in parallel with MST, alleviates the need for large-scale datasets and supplies complementary local features. We also propose a Feature Fusion Head to narrow the gap between convolutional features and transformer features. Extensive experiments demonstrate that MSTCNet outperforms state-of-the-art methods on the GlaS and ISIC18 datasets and is better suited to medical image segmentation with small-scale datasets.
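The abstract does not say how multi-scale patch partition is implemented. As a rough illustration only, the sketch below assumes it means cutting the same image into non-overlapping square patches at several sizes, so that each scale yields its own token sequence. The function names, the patch sizes (2 and 4), and the toy 8x8 image are all hypothetical and are not taken from the paper.

```python
def partition_patches(image, patch_size):
    """Split a 2D grid (list of rows) into non-overlapping square patches.

    Each patch is flattened row by row into one token (a flat list).
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    tokens = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            token = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            tokens.append(token)
    return tokens


def multi_scale_partition(image, patch_sizes=(2, 4)):
    """Return one token sequence per patch size (sizes are assumed values)."""
    return {size: partition_patches(image, size) for size in patch_sizes}


# Toy single-channel 8x8 "image" whose pixel value encodes its position.
image = [[row * 8 + col for col in range(8)] for row in range(8)]
scales = multi_scale_partition(image, patch_sizes=(2, 4))
for size, tokens in scales.items():
    print(size, len(tokens), len(tokens[0]))
```

Smaller patches give more, finer tokens and larger patches give fewer, coarser ones. An attention module that mixes scales would then operate across these sequences, but how MSTCNet does so is not described in the abstract.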