MixPolyp: Integrating Mask, Box and Scribble Supervision for Enhanced Polyp Segmentation

Yiwen Hu, Jun Wei, Yuncheng Jiang, Haoyang Li, Shuguang Cui, Zhen Li, Song Wu
2024-09-25
Abstract: Limited by the expensive labeling, polyp segmentation models are plagued by data shortages. To tackle this, we propose the mixed supervised polyp segmentation paradigm (MixPolyp). Unlike traditional models relying on a single type of annotation, MixPolyp combines diverse annotation types (mask, box, and scribble) within a single model, thereby expanding the range of available data and reducing labeling costs. To achieve this, MixPolyp introduces three novel supervision losses to handle various annotations: Subspace Projection loss (L_SP), Binary Minimum Entropy loss (L_BME), and Linear Regularization loss (L_LR). For box annotations, L_SP eliminates shape inconsistencies between the prediction and the supervision. For scribble annotations, L_BME provides supervision for unlabeled pixels through minimum entropy constraint, thereby alleviating supervision sparsity. Furthermore, L_LR provides dense supervision by enforcing consistency among the predictions, thus reducing the non-uniqueness. These losses are independent of the model structure, making them generally applicable. They are used only during training, adding no computational cost during inference. Extensive experiments on five datasets demonstrate MixPolyp's effectiveness.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses data scarcity and high annotation cost in colon polyp segmentation. Traditional polyp segmentation models rely on accurate pixel-level annotations, which take considerable manpower and time to produce, so training data is often insufficient and models tend to overfit. To address this, the authors propose a mixed-supervised polyp segmentation model, MixPolyp.

### Main Problems and Solutions

1. **Data Scarcity**
   - **Problem**: High-quality pixel-level annotated data is expensive and difficult to obtain, resulting in insufficient training data.
   - **Solution**: MixPolyp combines multiple types of annotated data (pixel-level masks, boxes, and scribbles), thereby expanding the range of available data and reducing the annotation cost.
2. **Annotation Noise and Sparse Supervision**
   - **Problem**: Box and scribble annotations are noisy and provide only sparse supervision, which may degrade model performance.
   - **Solution**: MixPolyp introduces three new supervision losses:
     - **Subspace Projection Loss (\(L_{SP}\))**: Projects the predicted mask and the box annotation onto one-dimensional vectors and compares them there, which reduces the shape inconsistency between the prediction and the box.
     - **Binary Minimum Entropy Loss (\(L_{BME}\))**: Minimizes the entropy of the predictions at unannotated pixels, providing supervision where scribble labels are absent.
     - **Linear Regularization Loss (\(L_{LR}\))**: Fuses the predictions of the fully-supervised and weakly-supervised branches to provide dense supervision and reduce non-uniqueness.

### Model Structure

- **Input Image**: The input image \(I\in\mathbb{R}^{H\times W}\) is passed through a backbone network to extract multi-scale features.
- **Supervision Branches**
  - **Fully-Supervised Branch**: Handles pixel-level annotated data.
  - **Box-Supervised Branch**: Uses the subspace projection loss \(L_{SP}\) to handle box-annotated data.
  - **Scribble-Supervised Branch**: Uses the binary minimum entropy loss \(L_{BME}\) to handle scribble-annotated data.
- **Linear Regularization Loss**: Applied in the box-supervised and scribble-supervised branches to enforce prediction consistency.

### Experimental Results

- **Datasets**: Experiments were carried out on five datasets: Kvasir, CVC-ClinicDB, CVC-ColonDB, EndoScene, and ETIS.
- **Performance Comparison**: MixPolyp outperforms existing state-of-the-art methods on multiple metrics, reaching a weighted average Dice coefficient of 85.9% and a weighted average IoU of 78.5%.

### Conclusion

MixPolyp addresses data scarcity and high annotation cost in polyp segmentation by combining multiple types of annotated data. The proposed loss functions improve both the performance and the generalization ability of the model. Future work will explore more types of annotated data to further improve the model.
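
### Illustrative Sketch of the Box and Scribble Losses

To make the ideas behind \(L_{SP}\) and \(L_{BME}\) more concrete, the following is a minimal, dependency-free sketch in plain Python. It is not the authors' implementation. The exact formulas are not given in this summary, so several choices here are assumptions: the projection is taken as a per-row and per-column maximum, the projected vectors are compared with binary cross-entropy, and unlabeled scribble pixels are marked with `None`. All function names are hypothetical.

```python
import math

EPS = 1e-7  # clamp probabilities away from 0 and 1 to keep log() finite


def project_max(mask):
    """Project a 2D map onto two 1D vectors (assumed: row-wise and column-wise max)."""
    rows = [max(row) for row in mask]
    cols = [max(col) for col in zip(*mask)]
    return rows, cols


def bce(p, t):
    """Binary cross-entropy between a probability p and a target t in {0, 1}."""
    p = min(max(p, EPS), 1 - EPS)
    return -(t * math.log(p) + (1 - t) * math.log(1 - p))


def subspace_projection_loss(pred, box_mask):
    """Compare prediction and box only through their 1D projections.

    pred:     2D list of foreground probabilities.
    box_mask: 2D list, 1 inside the annotated box and 0 outside.
    """
    pred_rows, pred_cols = project_max(pred)
    box_rows, box_cols = project_max(box_mask)
    terms = [bce(p, t) for p, t in zip(pred_rows + pred_cols, box_rows + box_cols)]
    return sum(terms) / len(terms)


def binary_entropy(p):
    """Entropy of a Bernoulli(p) prediction; largest at p = 0.5."""
    p = min(max(p, EPS), 1 - EPS)
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


def binary_min_entropy_loss(pred, scribble):
    """Mean entropy over pixels the scribble leaves unlabeled.

    scribble: 2D list with 1 (foreground), 0 (background), or None (unlabeled).
    Labeled pixels are ignored here; they would be handled by an ordinary
    supervised term, which is omitted from this sketch.
    """
    values = [
        binary_entropy(p)
        for pred_row, scr_row in zip(pred, scribble)
        for p, s in zip(pred_row, scr_row)
        if s is None
    ]
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    box = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
    diagonal = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]  # different shape, same extent
    print(subspace_projection_loss(diagonal, box))
    print(binary_min_entropy_loss([[0.5, 0.9]], [[None, 1]]))
```

In this sketch, a prediction that fills the box and a prediction that only covers its diagonal receive the same projection loss, because both have the same row and column extents. This illustrates why comparing projections avoids penalizing the prediction for not being box-shaped. The entropy term, in turn, pushes unlabeled pixels toward confident predictions (near 0 or 1) without saying which of the two values they should take.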