Weakly-Supervised Concealed Object Segmentation with SAM-based Pseudo Labeling and Multi-scale Feature Grouping

Chunming He, Kai Li, Yachao Zhang, Guoxia Xu, Longxiang Tang, Yulun Zhang, Zhenhua Guo, Xiu Li
2023-05-18
Abstract: Weakly-Supervised Concealed Object Segmentation (WSCOS) aims to segment objects well blended with surrounding environments using sparsely-annotated data for model training. It remains a challenging task since (1) it is hard to distinguish concealed objects from the background due to the intrinsic similarity and (2) the sparsely-annotated training data only provide weak supervision for model learning. In this paper, we propose a new WSCOS method to address these two challenges. To tackle the intrinsic similarity challenge, we design a multi-scale feature grouping module that first groups features at different granularities and then aggregates these grouping results. By grouping similar features together, it encourages segmentation coherence, helping obtain complete segmentation results for both single and multiple-object images. For the weak supervision challenge, we utilize the recently-proposed vision foundation model, Segment Anything Model (SAM), and use the provided sparse annotations as prompts to generate segmentation masks, which are used to train the model. To alleviate the impact of low-quality segmentation masks, we further propose a series of strategies, including multi-augmentation result ensemble, entropy-based pixel-level weighting, and entropy-based image-level selection. These strategies help provide more reliable supervision to train the segmentation model. We verify the effectiveness of our method on various WSCOS tasks, and experiments demonstrate that our method achieves state-of-the-art performance on these tasks.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses Weakly-Supervised Concealed Object Segmentation (WSCOS): training a model on sparsely annotated data to identify and segment objects that blend closely into their surroundings. The task poses two major challenges:

1. **Intrinsic Similarity**: Concealed objects closely resemble their backgrounds, which makes it difficult for the model to separate foreground from background.
2. **Weak Supervision**: The training data contain only sparse annotations (points or lines), so the supervision signal is limited, which restricts what the model can learn.

To address these challenges, the authors propose a new WSCOS method with the following components:

- **Multi-scale Feature Grouping (MFG)**: Features are grouped at different granularities and the grouping results are then aggregated. Grouping similar features together encourages segmentation coherence, which helps produce complete segmentation results for both single-object and multi-object images.
- **SAM-based Pseudo-label Generation**: The recently proposed vision foundation model Segment Anything Model (SAM) is prompted with the sparse annotations to generate segmentation masks, which serve as pseudo-labels for training the model.
- **Pseudo-label Improvement Strategies**: To reduce the impact of low-quality pseudo-labels, the authors propose a series of strategies: multi-augmentation result ensemble, entropy-based pixel-level weighting, and entropy-based image-level selection. Together these provide more reliable supervision.

The authors evaluate the method on multiple WSCOS tasks, and the experiments show that it achieves state-of-the-art performance.
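
The three pseudo-label strategies named above (multi-augmentation ensemble, entropy-based pixel-level weighting, entropy-based image-level selection) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the SAM masks from each augmented view have already been converted to foreground probabilities and mapped back to the original image frame, that the ensemble is a plain average, that the pixel weight is one minus the binary entropy, and that an image is kept when its mean entropy is below a threshold. The function names and the `max_mean_entropy` value are hypothetical choices made for illustration.

```python
import numpy as np


def ensemble_masks(prob_masks):
    """Average foreground probabilities over K augmented views.

    prob_masks: array-like of shape (K, H, W), values in [0, 1].
    Assumption: simple averaging; the paper may fuse views differently.
    """
    return np.mean(np.asarray(prob_masks, dtype=np.float64), axis=0)


def binary_entropy(p, eps=1e-8):
    """Per-pixel binary entropy in bits (0 = certain, 1 = maximally uncertain)."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))


def pixel_weights(p):
    """Down-weight uncertain pixels: weight = 1 - entropy (assumed form)."""
    return 1.0 - binary_entropy(p)


def keep_image(p, max_mean_entropy=0.5):
    """Keep a pseudo-label only if its mean entropy is low enough.

    max_mean_entropy is a hypothetical threshold, not a value from the paper.
    """
    return float(binary_entropy(p).mean()) <= max_mean_entropy


def weighted_bce(pred, pseudo, weights, eps=1e-8):
    """Binary cross-entropy against the pseudo-label, weighted per pixel."""
    pred = np.clip(pred, eps, 1.0 - eps)
    loss = -(pseudo * np.log(pred) + (1.0 - pseudo) * np.log(1.0 - pred))
    return float((weights * loss).sum() / (weights.sum() + eps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views = rng.random((4, 8, 8))          # stand-in for 4 augmented SAM outputs
    fused = ensemble_masks(views)
    if keep_image(fused, max_mean_entropy=0.99):
        w = pixel_weights(fused)
        pseudo = (fused > 0.5).astype(np.float64)
        print(weighted_bce(rng.random((8, 8)), pseudo, w))
```

In this sketch, pixels where the augmented views disagree end up near probability 0.5, so they receive a weight near zero and contribute little to the loss, while images whose fused mask is uncertain overall are dropped from training entirely.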