Integrating prior knowledge into a bibranch pyramid network for medical image segmentation

Xianjun Han, Tiantian Li, Can Bai, Hongyu Yang
DOI: https://doi.org/10.1016/j.imavis.2024.104945
IF: 3.86
2024-02-19
Image and Vision Computing
Abstract: Medical image segmentation is crucial for accurate diagnosis. Convolutional neural network (CNN)-based methods have made strides in recent years, but they struggle to model long-range dependencies. Transformer-based methods address this limitation but require more computational resources. The segment anything model (SAM) can generate pixel-level segmentation results for natural images from sparse manual prompts, but it performs poorly on low-contrast, noisy ultrasound images. To address these issues, we propose a new medical image segmentation network architecture that integrates transformer components, CNN modules, and a SAM encoder into a unified framework, allowing it to capture long-range dependencies and local features simultaneously. Additionally, we incorporate the image features extracted by SAM as prior knowledge to further improve segmentation accuracy with limited training data. To reduce the computational burden, we employ an axial attention mechanism that approximates a transformer's effect by expanding the receptive field. Rather than replacing the transformer components with lightweight attention modules, we divide the model into a global branch and a local branch: the global branch extracts context features with the transformer components, while the local branch processes patch tokens with the axial attention mechanism. We also construct an image pyramid to exploit internal statistics and multiscale representations and thereby obtain more accurate segmentation regions. Experimental results on various medical image datasets show that this bibranch pyramid transformer (Bi-BPT) architecture is effective and robust for medical image segmentation, surpassing other related segmentation network architectures.
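The abstract names axial attention as the way the local branch widens its receptive field at lower cost than full self-attention. The following is a minimal sketch of the general idea only, not the authors' implementation: it assumes a single head, identity query/key/value projections, and no positional encoding, and the function names (`attend_1d`, `axial_attention`, `attention_pairs`) are hypothetical. Each position first attends along its row, then along its column, so information can reach any other position after two passes.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend_1d(seq):
    """Self-attention over a 1-D sequence of feature vectors.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections); a real model would learn these projections.
    """
    dim = len(seq[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in seq:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, seq))
                    for c in range(dim)])
    return out


def axial_attention(fmap):
    """Apply attention along the width axis, then along the height axis.

    fmap is an H x W x C feature map stored as nested lists.
    """
    height, width = len(fmap), len(fmap[0])
    # Pass 1: every row attends over its own W positions.
    rows = [attend_1d(row) for row in fmap]
    # Pass 2: every column attends over its own H positions.
    cols = [attend_1d([rows[i][j] for i in range(height)])
            for j in range(width)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [[cols[j][i] for j in range(width)] for i in range(height)]


def attention_pairs(height, width):
    """Count query-key pairs for full 2-D attention versus axial attention."""
    tokens = height * width
    full = tokens * tokens            # every token attends to every token
    axial = tokens * (height + width)  # one row pass plus one column pass
    return full, axial
```

For a 32 x 32 token grid, `attention_pairs(32, 32)` gives 1,048,576 query-key pairs for full attention against 65,536 for the two axial passes, which illustrates why an axial scheme is a common choice when the computational budget is tight.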
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, software engineering, optics