How Many Annotations Do We Need for Generalizing New-Coming Shadow Images?

Xiao-Diao Chen, Wenyang Yang, Weiyin Ma, Wen Wu
DOI: https://doi.org/10.1109/TCSVT.2023.3263903
2023-11-01
Abstract: Unlabeled data is often used to improve the generalization ability of a segmentation model. However, this practice tends to neglect the inherent difficulty of unlabeled samples and produces inaccurate pseudo masks in some unseen scenes, resulting in severe confirmation bias and potential performance degradation. This motivates two unexplored questions for newly arriving data: (1) how many images do we need to annotate, and (2) how should we annotate them? In this paper, two shadow detectors (i.e., SDTR and SDTR+) based on the Transformer and a self-training scheme are successively proposed. The main difference between them is whether weak annotations are required for part of the unlabeled data. Specifically, in SDTR, we first introduce an image-level sample selection scheme that separates the unlabeled data into reliable and unreliable samples according to holistic prediction-level stability. Then, we perform selective retraining to exploit the unlabeled images progressively in a curriculum learning manner. In SDTR+, we further provide various weak labels (i.e., point, box, and scribble) for the remaining unreliable samples and design corresponding loss functions. In this way, SDTR+ achieves a better trade-off between performance improvement and annotation cost. Experimental results on public benchmarks (i.e., SBU, UCF, and ISTD) show that both SDTR and SDTR+ compare favorably with state-of-the-art methods.
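The image-level selection step can be illustrated with a minimal sketch. The abstract does not state how prediction-level stability is measured, so this sketch assumes one plausible proxy: the mean IoU between the masks predicted for an image at earlier training checkpoints and the mask predicted at the final checkpoint. The function names (`iou`, `stability_score`, `split_by_stability`), the `reliable_fraction` parameter, and the toy masks are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

Mask = Sequence[int]  # flattened binary shadow mask, 1 = shadow pixel


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two flattened binary masks."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 if union == 0 else inter / union


def stability_score(preds: Sequence[Mask]) -> float:
    """Assumed proxy: mean IoU of each earlier checkpoint's mask vs. the final one."""
    final, earlier = preds[-1], preds[:-1]
    if not earlier:
        raise ValueError("need predictions from at least two checkpoints")
    return sum(iou(p, final) for p in earlier) / len(earlier)


def split_by_stability(
    preds_per_image: Dict[str, Sequence[Mask]],
    reliable_fraction: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Rank images by stability; the top fraction is treated as reliable."""
    scores = {name: stability_score(p) for name, p in preds_per_image.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    cut = int(len(ranked) * reliable_fraction)
    return ranked[:cut], ranked[cut:]


# Toy data: three checkpoints per image, each mask has four pixels.
preds = {
    "a.png": [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]],  # predictions never change
    "b.png": [[1, 0, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0]],  # predictions fluctuate
}
reliable, unreliable = split_by_stability(preds, reliable_fraction=0.5)
print(reliable, unreliable)
```

Under this assumption, the reliable images would be pseudo-labeled and used first during retraining, while the unreliable ones would be deferred (as in SDTR) or given weak point, box, or scribble labels (as in SDTR+).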
Computer Science