Abstract: Introduction: U-shaped convolutional neural network (CNN) methods have yielded impressive results in medical image segmentation. However, this structure extracts context information at a single level only, which can lead to problems such as blurred boundaries. In addition, the inherent locality of the convolution operation limits its ability to capture global and long-distance semantic interactions. The transformer model, by contrast, excels at capturing global information. Methods: Given these considerations, this paper presents a transformer fusion context pyramid medical image segmentation network (CPFTransformer). CPFTransformer builds on the Swin Transformer and integrates edge perception to sharpen segmentation edges. To fuse global and multi-scale context information effectively, we introduce an Edge-Aware module based on a context pyramid, which specifically emphasizes local features such as edges and corners. Our approach employs a hierarchical Swin Transformer with a shifted window mechanism as the encoder to extract contextual features. A symmetric Swin Transformer-based decoder performs upsampling to restore the resolution of the feature maps. The encoder and decoder are connected by the Edge-Aware module, which extracts local features such as edges and corners. Results: Experimental evaluations on the Synapse multi-organ segmentation task and the ACDC dataset demonstrate the effectiveness of our method, which achieves 79.87% (DSC) and 20.83% (HD) on the Synapse multi-organ segmentation task. Discussion: The proposed method, which combines the context pyramid mechanism with the Transformer, enables fast and accurate automatic segmentation of medical images, thereby enhancing the precision and reliability of medical diagnosis. Furthermore, the approach presented in this study can potentially be extended to image segmentation of other organs in the future.
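
The results above are reported as DSC (Dice similarity coefficient) and HD (Hausdorff distance). As a rough illustration of what these two metrics measure, the following is a minimal NumPy sketch for binary masks. It is not the authors' evaluation code: the function names `dice_score` and `hausdorff_distance` are hypothetical, the handling of empty masks is an assumed convention, and the Hausdorff distance here is the plain maximum form in pixel units (many benchmarks instead report a 95th-percentile variant).

```python
import numpy as np

def dice_score(pred, target):
    """Dice similarity coefficient between two binary masks (range 0..1)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * intersection / denom)

def hausdorff_distance(pred, target):
    """Symmetric Hausdorff distance between foreground pixel sets, in pixels."""
    a = np.argwhere(np.asarray(pred, dtype=bool))
    b = np.argwhere(np.asarray(target, dtype=bool))
    if len(a) == 0 or len(b) == 0:
        # Assumed convention: undefined distance is reported as infinity.
        return float("inf")
    # Pairwise Euclidean distances between every foreground pixel in a and b.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Largest nearest-neighbour distance, taken in both directions.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

# Toy example: the prediction has one extra foreground pixel.
pred = np.array([[1, 1, 0],
                 [0, 0, 0]])
target = np.array([[1, 0, 0],
                   [0, 0, 0]])
dsc = dice_score(pred, target)
hd = hausdorff_distance(pred, target)
```

A higher DSC indicates better region overlap, while a lower HD indicates that predicted and reference boundaries lie closer together, which is why boundary-oriented components such as an edge-aware module are typically judged on HD as well as DSC.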

Coformer: Collaborative Transformer for Medical Image Segmentation

Mmformer: Multimodal Medical Transformer for Incomplete Multimodal Learning of Brain Tumor Segmentation

MixFormer: a Mixed CNN-Transformer Backbone for Medical Image Segmentation

ScaleFormer: Revisiting the Transformer-based Backbones from a Scale-wise Perspective for Medical Image Segmentation

MCRformer: Morphological constraint reticular transformer for 3D medical image segmentation

CoTrFuse: a novel framework by fusing CNN and transformer for medical image segmentation

A Parallelly Contextual Convolutional Transformer for Medical Image Segmentation

CPFTransformer: transformer fusion context pyramid medical image segmentation network

ConvFormer: Combining CNN and Transformer for Medical Image Segmentation

MdcFormer: Transformers Based on Dynamic Weights and Multi-Scale for Medical Image Segmentation

Dual encoder network with transformer-CNN for multi-organ segmentation

HCA-former: Hybrid Convolution Attention Transformer for 3D Medical Image Segmentation

TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation

Context-aware and local-aware fusion with transformer for medical image segmentation

A Dynamic Cross-Scale Transformer with Dual-Compound Representation for 3D Medical Image Segmentation

Hybrid-Fusion Transformer for Multisequence MRI

CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation

CCFNet: Collaborative Cross-Fusion Network for Medical Image Segmentation

MOSformer: Momentum encoder-based inter-slice fusion transformer for medical image segmentation

D-former: a U-shaped Dilated Transformer for 3D medical image segmentation