Abstract: Automatic segmentation of medical images provides a reliable basis for disease diagnosis and analysis. Methods that combine the strengths of convolutional neural networks (CNNs) and Transformers have made significant progress, but current CNN-Transformer integration has two key limitations. First, most methods overlook, or fail to fully exploit, the complementarity between local and global features. Second, they often disregard the value of integrating multi-scale encoder features from the dual-branch network to enhance the decoding features. To address these issues, we present a dual-branch cross-attention fusion network (DCFNet), which combines Swin Transformer and CNN to generate complementary global and local features. We design the Feature Cross-Fusion (FCF) module to fuse local and global features efficiently. Within the FCF, the Channel-wise Cross-fusion Transformer (CCT) aggregates multi-scale features, and the Feature Fusion Module (FFM) aggregates the prominent feature regions of the two branches from the spatial perspective. In the decoding phase of the dual-branch network, our proposed Channel Attention Block (CAB) emphasizes the significant channel features between the up-sampled features and the features generated by the FCF module, enhancing the detail of the decoding. Experimental results demonstrate that DCFNet improves segmentation accuracy and is highly competitive with other state-of-the-art (SOTA) methods. DCFNet's accurate segmentation of medical images can assist medical professionals in diagnosing lesion areas early.

Hybrid Attention Mechanism of Feature Fusion for Medical Image Segmentation

A Feature Fusion Module Based on Complementary Attention for Medical Image Segmentation

Cross Attention Multi Scale CNN-Transformer Hybrid Encoder is General Medical Image Learner

An attention mechanism and multi-feature fusion network for medical image segmentation

Dual encoder network with transformer-CNN for multi-organ segmentation

Hybrid-scale Contextual Fusion Network for Medical Image Segmentation

MSMHSA-DeepLab V3+: An Effective Multi-Scale, Multi-Head Self-Attention Network for Dual-Modality Cardiac Medical Image Segmentation

DAMAF: dual attention network with multi-level adaptive complementary fusion for medical image segmentation

STA-Former: enhancing medical image segmentation with Shrinkage Triplet Attention in a hybrid CNN-Transformer model

FIAS: Feature Imbalance-Aware Medical Image Segmentation with Dynamic Fusion and Mixing Attention

ECSFF: Exploring Efficient Cross-Scale Feature Fusion for Medical Image Segmentation

Multi-Attention Mechanism Medical Image Segmentation Combined with Word Embedding Technology

CASF-Net: Cross-attention and Cross-scale Fusion Network for Medical Image Segmentation

DCFNet: An Effective Dual-Branch Cross-Attention Fusion Network for Medical Image Segmentation

HMDA: A Hybrid Model with Multi-scale Deformable Attention for Medical Image Segmentation

HCA-former: Hybrid Convolution Attention Transformer for 3D Medical Image Segmentation

Two-Stage CNN Whole Heart Segmentation Combining Image Enhanced Attention Mechanism and Metric Classification

Incorporating the hybrid deformable model for improving the performance of abdominal CT segmentation via multi-scale feature fusion network

Sub-pixel multi-scale fusion network for medical image segmentation

[Multi-scale medical image segmentation based on pixel encoding and spatial attention mechanism]

CFNet: A Medical Image Segmentation Method Using the Multi-View Attention Mechanism and Adaptive Fusion Strategy