Dual semantic-guided model for weakly-supervised zero-shot semantic segmentation
Fengli Shen, Zhe-Ming Lu, Ziqian Lu, Zonghui Wang
DOI: https://doi.org/10.1007/s11042-021-11792-1
IF: 2.577
2021-12-22
Multimedia Tools and Applications
Abstract: The major obstacle in semantic segmentation is that training an effective model requires a large amount of pixel-level labeled data. To reduce annotation cost, weakly-supervised methods replace per-pixel labels with weaker labels, while zero-shot methods transfer knowledge learned from seen classes to unseen classes, reducing the number of classes that need to be labeled. To further alleviate the annotation burden, we introduce a more challenging task, Weakly-supervised Zero-shot Semantic Segmentation (WZSS): learning models that use only image-level annotations of seen classes to segment images containing unseen objects. To this end, we propose a Dual Semantic-Guided (DSG) model, which is doubly guided by semantic embeddings of classes to obtain classification scores and localization maps. By ignoring the localization maps with low classification scores, our framework generates predicted segmentation masks. To improve performance, we propose a simple stochastic selection on semantic embeddings during inference, which explores the difference between image-level class embeddings and pixel-level class embeddings. This simple approach increases our model’s hIoU from 25.9 to 31.8. In addition, compared with some zero-shot semantic segmentation methods, our method delivers better results in terms of hIoU (31.8) and $\text{mIoU}_{u}$ (22.0) on the PASCAL VOC 2012 dataset while using less supervision.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
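
The abstract states that segmentation masks are produced by ignoring localization maps whose classes receive low classification scores. A minimal sketch of that filtering step is given below. It is an illustration under stated assumptions, not the authors' implementation: the function name `masks_from_scores`, the two threshold values, the array shapes, and the convention that label 0 means background are all hypothetical and do not come from the paper.

```python
import numpy as np


def masks_from_scores(loc_maps, cls_scores, score_thresh=0.5, bg_thresh=0.3):
    """Combine per-class localization maps into one label mask.

    loc_maps:   array of shape (C, H, W), per-class activation in [0, 1]
    cls_scores: array of shape (C,), image-level score per class
    Thresholds are illustrative assumptions, not values from the paper.
    """
    loc_maps = np.asarray(loc_maps, dtype=float)
    cls_scores = np.asarray(cls_scores, dtype=float)

    # Ignore the localization maps of classes with low classification scores.
    keep = cls_scores >= score_thresh
    filtered = np.where(keep[:, None, None], loc_maps, 0.0)

    # Per pixel, pick the strongest remaining class.
    best_class = filtered.argmax(axis=0)
    best_value = filtered.max(axis=0)

    # Assumed convention: label 0 is background, classes are 1..C.
    return np.where(best_value >= bg_thresh, best_class + 1, 0)


# Toy example with two classes on a 2x2 image.
toy_maps = np.array([
    [[0.90, 0.10], [0.20, 0.80]],   # class 1
    [[0.95, 0.90], [0.10, 0.10]],   # class 2
])
mask_one_kept = masks_from_scores(toy_maps, [0.8, 0.2])
mask_both_kept = masks_from_scores(toy_maps, [0.8, 0.9])
```

In the first call the second class falls below the score threshold, so its map is dropped even though it has strong activations; in the second call both maps compete per pixel. How DSG actually computes the scores and maps from semantic embeddings is described in the paper itself and is not reproduced here.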