HCA-former: Hybrid Convolution Attention Transformer for 3D Medical Image Segmentation

Fan Yang, Fan Wang, Pengwei Dong, Bo Wang
DOI: https://doi.org/10.1016/j.bspc.2023.105834
IF: 5.1
2024-01-01
Biomedical Signal Processing and Control
Abstract: Using medical image segmentation algorithms to automatically label diseased organs or tissues can help doctors diagnose and treat various diseases. Most segmentation models are built on convolution. However, because of the inherent limitations of the convolution operation, its receptive field is usually limited and it cannot capture global information. Transformers can capture long-range dependencies, but their lack of local detail can limit predictive capability. To address these problems, we propose a segmentation method based on a hybrid of transformer and convolution. It merges the classic convolution layer with the transformer layer to help guide accurate segmentation. Specifically, a lightweight convolutional layer is first used to extract low-level features and reduce the amount of data. This is followed by a mix of transformer blocks and convolution blocks that aid in extracting high-level information. This hybrid of convolution and transformer layers helps improve the generalization and robustness of the learned representation. In the decoder, the feature maps of adjacent dimensions are up-sampled into higher-resolution feature maps and sent to a transformer layer for decoding. A cross-fusion attention mechanism adaptively screens valid features from the encoder by reducing the semantic gap between the feature maps of the encoder and decoder subnetworks. Extensive experiments on the Synapse dataset show that the proposed method is competitive with, and offers advantages over, other segmentation methods.
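The decoding step described in the abstract (up-sample a coarse feature map, then use attention to select encoder features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the cross-fusion attention is built, so the sketch assumes plain single-head scaled dot-product cross-attention with a residual connection, nearest-neighbor up-sampling, and random weights. The names `upsample_nearest` and `cross_attention_fuse` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def upsample_nearest(x, factor=2):
    # Nearest-neighbor up-sampling of a (D, H, W, C) feature map
    # along its three spatial axes (assumed up-sampling scheme).
    for axis in range(3):
        x = np.repeat(x, factor, axis=axis)
    return x

def cross_attention_fuse(decoder_tokens, encoder_tokens, w_q, w_k, w_v):
    # Queries come from the decoder; keys and values come from the encoder,
    # so each decoder token picks the encoder features most relevant to it.
    q = decoder_tokens @ w_q
    k = encoder_tokens @ w_k
    v = encoder_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)      # (N_dec, N_enc), rows sum to 1
    attended = weights @ v
    return decoder_tokens + attended, weights  # residual fusion (assumption)

rng = np.random.default_rng(0)
C = 8                                        # hypothetical channel width
coarse = rng.standard_normal((2, 2, 2, C))   # low-resolution decoder map
skip = rng.standard_normal((4, 4, 4, C))     # encoder map at the target scale

# Up-sample the decoder map, then flatten both maps to token sequences.
dec_tokens = upsample_nearest(coarse).reshape(-1, C)   # (64, C)
enc_tokens = skip.reshape(-1, C)                       # (64, C)

# Random projection matrices stand in for learned weights.
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))
fused, weights = cross_attention_fuse(dec_tokens, enc_tokens, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In a trained model the projection matrices would be learned, and the attention weights would indicate which encoder positions each decoder position draws on.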