Abstract: Limited by the expensive labeling, polyp segmentation models are plagued by data shortages. To tackle this, we propose the mixed supervised polyp segmentation paradigm (MixPolyp). Unlike traditional models relying on a single type of annotation, MixPolyp combines diverse annotation types (mask, box, and scribble) within a single model, thereby expanding the range of available data and reducing labeling costs. To achieve this, MixPolyp introduces three novel supervision losses to handle various annotations: Subspace Projection loss (L_SP), Binary Minimum Entropy loss (L_BME), and Linear Regularization loss (L_LR). For box annotations, L_SP eliminates shape inconsistencies between the prediction and the supervision. For scribble annotations, L_BME provides supervision for unlabeled pixels through a minimum entropy constraint, thereby alleviating supervision sparsity. Furthermore, L_LR provides dense supervision by enforcing consistency among the predictions, thus reducing the non-uniqueness. These losses are independent of the model structure, making them generally applicable. They are used only during training, adding no computational cost during inference. Extensive experiments on five datasets demonstrate MixPolyp's effectiveness.

What problem does this paper attempt to address?

This paper addresses data scarcity and high annotation cost in colon polyp segmentation. Traditional polyp segmentation models rely on accurate pixel-level annotations, which take a great deal of manpower and time to produce and therefore tend to leave models with insufficient training data and a risk of over-fitting. To address this, the authors propose a mixed-supervised polyp segmentation model, MixPolyp.

### Main Problems and Solutions

1. **Data Scarcity**
   - **Problem**: High-quality pixel-level annotated data is expensive and difficult to obtain, resulting in insufficient training data.
   - **Solution**: MixPolyp combines multiple types of annotated data (pixel-level, box-level, and scribble-level), thereby expanding the range of available data and reducing the annotation cost.
2. **Annotation Noise and Sparse Supervision**
   - **Problem**: Box-level and scribble-level annotations suffer from noise and sparse supervision, which may degrade model performance.
   - **Solution**: MixPolyp introduces three new supervision losses:
     - **Subspace Projection Loss (\(L_{SP}\))**: Projects the predicted mask and the box annotation onto one-dimensional vectors and compares them there, which reduces the shape inconsistency between prediction and box.
     - **Binary Minimum Entropy Loss (\(L_{BME}\))**: Minimizes the entropy of unannotated pixels, providing supervision in sparsely supervised areas.
     - **Linear Regularization Loss (\(L_{LR}\))**: Fuses the predictions of the fully-supervised and weakly-supervised branches, providing dense supervision and reducing non-uniqueness.

### Model Structure

- **Input Image**: The input image \(I\in\mathbb{R}^{H\times W}\) is passed through a backbone network to extract multi-scale features.
- **Supervision Branches**
  - **Fully-Supervised Branch**: Handles pixel-level annotated data.
  - **Box-Supervised Branch**: Uses the subspace projection loss \(L_{SP}\) to process box-level annotated data.
  - **Scribble-Supervised Branch**: Uses the binary minimum entropy loss \(L_{BME}\) to process scribble-level annotated data.
- **Linear Regularization Loss**: Introduced in the box-supervised and scribble-supervised branches to ensure prediction consistency.

### Experimental Results

- **Datasets**: Experiments were carried out on five datasets: Kvasir, CVC-ClinicDB, CVC-ColonDB, EndoScene, and ETIS.
- **Performance Comparison**: MixPolyp outperforms existing state-of-the-art methods on multiple metrics, reaching a weighted-average Dice coefficient of 85.9% and a weighted-average IoU of 78.5%.

### Conclusion

MixPolyp addresses data scarcity and high annotation cost in polyp segmentation by combining multiple types of annotated data. The proposed loss functions improve both the performance and the generalization ability of the model. Future work will explore more types of annotated data to further improve the model.
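To make the ideas behind the Subspace Projection and Binary Minimum Entropy losses more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. The function names, the use of max-projection along rows and columns, the mean absolute difference as the comparison, and the `eps` clamp are all assumptions made for illustration. A real training setup would use a tensor library with automatic differentiation instead of nested lists.

```python
import math

def project(mask):
    """Max-project a 2-D map (list of rows) onto its two axes.

    Assumption: max-projection is used here as one plausible way to
    reduce a 2-D map to 1-D vectors; the summary does not specify it.
    """
    rows = [max(r) for r in mask]        # one value per row
    cols = [max(c) for c in zip(*mask)]  # one value per column
    return rows, cols

def projection_loss(pred, box):
    """Mean absolute difference between the 1-D projections of pred and box."""
    pr, pc = project(pred)
    br, bc = project(box)
    diffs = [abs(a - b) for a, b in zip(pr + pc, br + bc)]
    return sum(diffs) / len(diffs)

def binary_entropy_loss(pred, labeled, eps=1e-7):
    """Mean binary entropy over pixels that carry no scribble label.

    `labeled` holds 1 (or True) where a scribble covers the pixel.
    """
    total, count = 0.0, 0
    for prow, lrow in zip(pred, labeled):
        for p, is_labeled in zip(prow, lrow):
            if is_labeled:
                continue  # labeled pixels are supervised elsewhere
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total += -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))
            count += 1
    return total / count if count else 0.0

# A rectangular box mask and a non-rectangular prediction inside it.
box = [[0, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 1, 1, 0],
       [0, 0, 0, 0]]
pred = [[0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0]]
empty = [[0] * 4 for _ in range(4)]

loss_same_extent = projection_loss(pred, box)
loss_empty = projection_loss(empty, box)
entropy_uncertain = binary_entropy_loss([[0.5, 0.5]], [[0, 0]])
entropy_confident = binary_entropy_loss([[1.0, 0.0]], [[0, 0]])
```

In this toy example the prediction differs from the box pixel by pixel, yet both have the same row and column extents, so the projected comparison does not penalize the prediction for failing to be rectangular. The entropy term is largest for uncertain predictions (0.5) and close to zero for confident ones, which is how an entropy constraint can push unlabeled pixels toward a binary decision.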

MixPolyp: Integrating Mask, Box and Scribble Supervision for Enhanced Polyp Segmentation

PolypMixNet: Enhancing semi-supervised polyp segmentation with polyp-aware augmentation

WeakPolyp: You Only Look Bounding Box for Polyp Segmentation

ModelMix: A New Model-Mixup Strategy to Minimize Vicinal Risk across Tasks for Few-scribble based Cardiac Segmentation

BoxPolyp:Boost Generalized Polyp Segmentation Using Extra Coarse Bounding Box Annotations

IBoxCLA: Towards Robust Box-supervised Segmentation of Polyp via Improved Box-dice and Contrastive Latent-anchors

Non-equivalent images and pixels: Confidence-aware resampling with meta-learning mixup for polyp segmentation

Polyp-E: Benchmarking the Robustness of Deep Segmentation Models via Polyp Editing

S$^2$ME: Spatial-Spectral Mutual Teaching and Ensemble Learning for Scribble-supervised Polyp Segmentation

MonoBox: Tightness-free Box-supervised Polyp Segmentation using Monotonicity Constraint

Annotation-Efficient Polyp Segmentation via Active Learning

CycleMix: A Holistic Strategy for Medical Image Segmentation from Scribble Supervision

PCLMix: Weakly Supervised Medical Image Segmentation via Pixel-Level Contrastive Learning and Dynamic Mix Augmentation

Harnessing Hard Mixed Samples with Decoupled Regularizer

Decoupled Mixup for Data-efficient Learning

Generalize Polyp Segmentation via Inpainting across Diverse Backgrounds and Pseudo-Mask Refinement

Probabilistic Modeling Ensemble Vision Transformer Improves Complex Polyp Segmentation

MixSup: Mixed-grained Supervision for Label-efficient LiDAR-based 3D Object Detection

Polyp-DDPM: Diffusion-Based Semantic Polyp Synthesis for Enhanced Segmentation

Segment Anything Model-guided Collaborative Learning Network for Scribble-supervised Polyp Segmentation

Polyp-Mamba: A Hybrid Multi-Frequency Perception Gated Selection Network for polyp segmentation