Abstract:
Background and objective Computer-based biomedical image segmentation plays a crucial role in the planning of assisted diagnostics and therapy. However, because segmentation targets vary in size and have irregular shapes, constructing an effective medical image segmentation structure remains a challenge. Recently, hybrid architectures based on convolutional neural networks (CNNs) and transformers have been proposed. However, most current backbones directly replace one or all convolutional layers with transformer blocks, regardless of the semantic gap between features. Thus, how to sufficiently and effectively eliminate the semantic gap while combining global and local information is a critical challenge.
Methods To address this challenge, we propose a novel structure, called BiU-Net, which integrates CNNs and transformers with a two-stage fusion strategy. In the first fusion stage, called the Single-Scale Fusion (SSF) stage, the encoding layers of the CNNs and transformers are coupled, with both having the same feature map size. The SSF stage aims to reconstruct local features based on CNNs and long-range information based on transformers in each encoding block. In the second stage, called the Multi-Scale Fusion (MSF) stage, BiU-Net interacts with multi-scale features from various encoding layers to eliminate the semantic gap between deep and shallow layers. Furthermore, a Context-Aware Block (CAB) is embedded in the bottleneck to reinforce multi-scale features in the decoder.
Results Experiments were conducted on four public datasets. On the BUSI dataset, BiU-Net achieved 85.50 % on Dice coefficient (Dice), 76.73 % on intersection over union (IoU), and 97.23 % on accuracy (ACC). Compared to the state-of-the-art method, BiU-Net improves Dice by 1.17 %. On the MoNuSeg dataset, the proposed method attained the highest scores, reaching 80.27 % and 67.22 % for Dice and IoU, respectively. BiU-Net achieves Dice scores of 95.33 % and 81.22 % on the PH2 and DRIVE datasets, respectively.
Conclusions The results of our experiments showed that BiU-Net outperforms existing state-of-the-art methods on four publicly available biomedical datasets. Owing to its powerful multi-scale feature extraction ability, BiU-Net is a versatile segmentation framework for various types of medical images. The source code is available at https://github.com/ZYLandy/BiU-Net.
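The abstract reports Dice, IoU, and ACC without stating how they are computed. As a minimal sketch, assuming the standard binary-mask definitions (Dice = 2TP / (2TP + FP + FN), IoU = TP / (TP + FP + FN), ACC = (TP + TN) / all pixels), the three metrics can be written as follows. The function names and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not taken from the BiU-Net source code.

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], target: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count TP, FP, FN, TN over two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        p, t = bool(p), bool(t)
        if p and t:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient: 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


def iou(pred: Sequence[int], target: Sequence[int]) -> float:
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else tp / denom


def accuracy(pred: Sequence[int], target: Sequence[int]) -> float:
    """Pixel accuracy: (TP + TN) / total number of pixels."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("masks must not be empty")
    return (tp + tn) / total


# Toy example: a flattened 8-pixel prediction and ground-truth mask.
example_pred = [1, 1, 0, 0, 1, 0, 0, 0]
example_target = [1, 0, 0, 0, 1, 1, 0, 0]

if __name__ == "__main__":
    print(f"Dice: {dice(example_pred, example_target):.4f}")
    print(f"IoU:  {iou(example_pred, example_target):.4f}")
    print(f"ACC:  {accuracy(example_pred, example_target):.4f}")
```

On the toy masks there are 2 true positives, 1 false positive, 1 false negative, and 4 true negatives, giving Dice = 4/6, IoU = 2/4, and ACC = 6/8. Under these definitions Dice is never lower than IoU for the same prediction, which is consistent with the ordering of the scores reported above. Accuracy is dominated by background pixels when the target is small, which may explain why ACC on BUSI is much higher than Dice or IoU.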

Integrating prior knowledge into a bibranch pyramid network for medical image segmentation

Integrating Spatial Prior Adapter for Enhancing SAM Performance in Medical Image Segmentation

BRAU-Net++: U-Shaped Hybrid CNN-Transformer Network for Medical Image Segmentation

Pyramid Medical Transformer for Medical Image Segmentation

BMCS-Net: A Bi-directional multi-scale cascaded segmentation network based on transformer-guided feature Aggregation for medical images

J-CaPA : Joint Channel and Pyramid Attention Improves Medical Image Segmentation

Boundary-guided feature integration network with hierarchical transformer for medical image segmentation

TSCA-Net: Transformer based spatial-channel attention segmentation network for medical images

MambaClinix: Hierarchical Gated Convolution and Mamba-Based U-Net for Enhanced 3D Medical Image Segmentation

Cross Pyramid Transformer makes U-net stronger in medical image segmentation

Hybrid CNN-Transformer model for medical image segmentation with pyramid convolution and multi-layer perceptron

CascadeMedSeg: integrating pyramid vision transformer with multi-scale fusion for precise medical image segmentation

MultiTrans: Multi-branch transformer network for medical image segmentation

A novel medical image segmentation approach by using multi-branch segmentation network based on local and global information synchronous learning

MBUTransNet: multi-branch U-shaped network fusion transformer architecture for medical image segmentation

TPAFNet: Transformer-Driven Pyramid Attention Fusion Network for 3D Medical Image Segmentation

CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation

CPFTransformer: transformer fusion context pyramid medical image segmentation network

BiU-net: A dual-branch structure based on two-stage fusion strategy for biomedical image segmentation

TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for Medical Image Segmentation

HTC-Net: A hybrid CNN-transformer framework for medical image segmentation