Multiscale Progressive Text Prompt Network for Medical Image Segmentation

Xianjun Han, Qianqian Chen, Zhaoyang Xie, Xuejun Li, Hongyu Yang

2023-07-01

Abstract: The accurate segmentation of medical images is a crucial step in obtaining reliable morphological statistics. However, training a deep neural network for this task requires a large amount of labeled data to ensure high-accuracy results. To address this issue, we propose using progressive text prompts as prior knowledge to guide the segmentation process. Our model consists of two stages. In the first stage, we perform contrastive learning on natural images to pretrain a powerful prior prompt encoder (PPE). This PPE leverages text prior prompts to generate multimodality features. In the second stage, medical image and text prior prompts are sent into the PPE inherited from the first stage to achieve the downstream medical image segmentation task. A multiscale feature fusion block (MSFF) combines the features from the PPE to produce multiscale multimodality features. These two progressive features not only bridge the semantic gap but also improve prediction accuracy. Finally, an UpAttention block refines the predicted results by merging the image and text features. This design provides a simple and accurate way to leverage multiscale progressive text prior prompts for medical image segmentation. Compared with using only images, our model achieves high-quality results with low data annotation costs. Moreover, our model not only has excellent reliability and validity on medical images but also performs well on natural images. The experimental results on different image datasets demonstrate that our model is effective and robust for image segmentation.
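The first-stage contrastive pretraining can be illustrated with a generic image-text contrastive objective. The sketch below is a minimal symmetric InfoNCE loss in plain Python, not the authors' implementation: the function names, the toy two-dimensional embeddings, and the temperature of 0.07 are all assumptions made for illustration, and the abstract does not state which contrastive loss the paper actually uses.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    # Numerically stable log-softmax over one list of logits.
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss; pair i of image_embs and text_embs is the positive.

    The temperature value is an assumed default, not taken from the paper.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j]: temperature-scaled cosine similarity of image i and text j.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    # Image-to-text direction: row i should put its mass on column i.
    i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: column j should put its mass on row j.
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    t2i = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (i2t + t2i)

# Toy embeddings (hypothetical): two images and their two text prompts.
image_embs = [[1.0, 0.0], [0.0, 1.0]]
matched = contrastive_loss(image_embs, [[0.9, 0.1], [0.1, 0.9]])
swapped = contrastive_loss(image_embs, [[0.1, 0.9], [0.9, 0.1]])
```

With correctly paired prompts the loss is close to zero, while swapping the prompts makes it much larger; minimizing this gap is what pulls matching image and text embeddings together during pretraining.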

Image and Video Processing, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper addresses several key challenges in medical image segmentation:

1. **High Data Annotation Cost**: Training a deep neural network for medical image segmentation requires a large amount of annotated data to reach high accuracy. To reduce this requirement, the authors use text prompts as prior knowledge to guide the segmentation process.
2. **Multimodal Information Fusion**: Combining image and text information can improve segmentation quality. Text prompts are used to generate multimodal features, which are then fused with image features.
3. **Semantic Gap**: A semantic gap separates natural-image data from medical data. The authors propose a two-stage training process: contrastive pretraining is first carried out on natural images, and the pretrained model is then applied to the medical image segmentation task.
4. **Efficient Segmentation Model**: The model structure is designed to capture contextual semantic information while maintaining high accuracy under limited computational resources. It combines Convolutional Neural Network (CNN) and Transformer modules to balance the extraction of local and global features.

In summary, the paper proposes a medical image segmentation method that introduces text prompts and contrastive learning to achieve high-quality segmentation results at lower data annotation cost.
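As a rough illustration of fusing text features into image features, the sketch below applies generic scaled dot-product cross-attention with a residual connection, written in plain Python. It is not the paper's MSFF or UpAttention block, whose internals are not described here; the function name `fuse_with_text`, the toy feature vectors, and the absence of learned projection matrices are all simplifying assumptions.

```python
import math

def softmax(scores):
    # Numerically stable softmax over one list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_with_text(image_feats, text_feats):
    """Add an attention-weighted mix of text features to each image feature.

    image_feats: one vector per image location (used as queries).
    text_feats: one vector per text token (used as keys and values).
    No learned projections are applied; this is an assumption for brevity.
    """
    dim = len(text_feats[0])
    fused = []
    for q in image_feats:
        # Scaled dot-product score between this location and every text token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in text_feats]
        weights = softmax(scores)
        # Text context: weighted sum of the text token vectors.
        context = [sum(w * k[d] for w, k in zip(weights, text_feats))
                   for d in range(dim)]
        # Residual connection keeps the original image information.
        fused.append([a + c for a, c in zip(q, context)])
    return fused

# Toy features (hypothetical): three image locations, two text tokens.
image_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
text_feats = [[2.0, 0.0], [0.0, 2.0]]
fused = fuse_with_text(image_feats, text_feats)
```

Each image location attends most strongly to the text token it resembles, so the first location is pushed further along the first axis, while the third location, which is equally similar to both tokens, receives an even mix.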

Multi-Bottleneck Progressive Propulsion Network for Medical Image Semantic Segmentation with Integrated Macro-Micro Dual-Stage Feature Enhancement and Refinement

Medical Visual Prompting (MVP): A Unified Framework for Versatile and High-Quality Medical Image Segmentation

Spatial-Frequency Dual Progressive Attention Network For Medical Image Segmentation

PE-MED: Prompt Enhancement for Interactive Medical Image Segmentation

Curriculum Prompting Foundation Models for Medical Image Segmentation

Prompting Segment Anything Model with Domain-Adaptive Prototype for Generalizable Medical Image Segmentation

Promise: Prompt-driven 3D Medical Image Segmentation Using Pretrained Image Foundation Models

Progressive Attention Module for Segmentation of Volumetric Medical Images

ProCNS: Progressive Prototype Calibration and Noise Suppression for Weakly-Supervised Medical Image Segmentation

DmADs-Net: Dense multiscale attention and depth-supervised network for medical image segmentation

Prior Attention Network for Multi-Lesion Segmentation in Medical Images

MSDEnet: Multi-scale detail enhanced network based on human visual system for medical image segmentation

Progressive Vision-Language Prompt for Multi-Organ Multi-Class Cell Semantic Segmentation with Single Branch

Enhancing medical text detection with vision-language pre-training and efficient segmentation

MH-Net: Model-data-driven hybrid-fusion network for medical image segmentation

Collaborative multi-feature extraction and scale-aware semantic information mining for medical image segmentation

Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation

MpMsCFMA-Net: Multi-path Multi-scale Context Feature Mixup and Aggregation Network for medical image segmentation

CM-SegNet: A Deep Learning-Based Automatic Segmentation Approach for Medical Images by Combining Convolution and Multilayer Perceptron

Multi-scale Feature Pyramid Fusion Network for Medical Image Segmentation