CTC-Net: A Novel Coupled Feature-Enhanced Transformer and Inverted Convolution Network for Medical Image Segmentation

Shixiang Zhang, Yang Xu, Zebin Wu, Zhihui Wei
DOI: https://doi.org/10.1007/978-3-031-47637-2_21
2023-01-01
Abstract: In recent years, the Vision Transformer has gradually replaced the CNN as the mainstream method in medical image segmentation due to its powerful ability to model long-range dependencies. However, segmentation networks built on pure transformers perform poorly in feature expression because they lack convolutional locality. In addition, channel-dimension information is lost in the network. In this paper, we propose a novel segmentation network termed CTC-Net to address these problems. Specifically, we design a feature-enhanced transformer module with spatial-reduction attention that uses depth-wise convolution to extract region details in the image patches. Point-wise convolution is then leveraged to capture non-linear relationships in the channel dimension. Furthermore, a parallel convolutional encoder branch and an inverted residual coordinate attention block are designed to capture the dependencies among local context, channel-dimension features, and location information. Extensive experiments on the Synapse Multi-organ CT and ACDC (Automatic Cardiac Diagnosis Challenge) datasets show that our method outperforms methods based on CNNs and pure transformers, improving DSC scores by up to 1.72% and 0.68%, respectively.
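For readers unfamiliar with the metric behind the reported gains, the following is a minimal sketch of the Dice similarity coefficient (DSC), defined as 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B. It is an illustration only, not the authors' evaluation code: the function name `dice_score`, the flat 0/1 list representation, and the convention that two empty masks score 1.0 are all assumptions made for this example.

```python
def dice_score(pred, target):
    """Dice similarity coefficient for two flat binary masks (sequences of 0/1).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    # Total foreground pixels across both masks.
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted and 3 true foreground pixels, 2 of them shared.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```

In multi-organ settings such as Synapse, a score like this is typically computed per organ class and then averaged; whether the paper averages in exactly this way is not stated in the abstract.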