Autonomous Learning of Semantic Segmentation from Internet Images

Qibin HOU, Ling-Hao HAN, Jiang-Jiang LIU, Ming-Ming CHENG
DOI: https://doi.org/10.1360/ssi-2020-0146
2021-01-01
Scientia Sinica Informationis
Abstract: Collecting a large amount of manually labeled training data is labor-intensive and thus often becomes the major bottleneck when applying semantic segmentation techniques to real-world applications, especially for new categories where no labeled data is available. In this paper, we aim to solve the problem of “webly-supervised” semantic segmentation relying purely on web-searched images, where users only need to provide a single keyword for each target category. A major challenge in this task is the existence of label noise in web images. To deal with the label noise, we design a noise erasing network that is able to learn cross-image knowledge from credible attention regions in the images of a mini-batch and then erase regions unrelated to the search keywords from the web images. With this network, our system can automatically generate high-quality “proxy ground truth” for training semantic segmentation models. Extensive experiments on the popular PASCAL VOC 2012 benchmark show surprisingly good results in both our task (mIoU = 62.0%) and the weakly-supervised setting (mIoU = 66.1%).