Boat in the Sky: Background Decoupling and Object-aware Pooling for Weakly Supervised Semantic Segmentation

Jianjun Xu, Hongtao Xie, Hai Xu, Yuxin Wang, Sun-ao Liu, Yongdong Zhang
DOI: https://doi.org/10.1145/3503161.3548201
2022-01-01
Abstract: Previous image-level weakly supervised semantic segmentation methods based on Class Activation Maps (CAMs) have two limitations: 1) they focus on only the most discriminative parts of the foreground, and 2) they include undesirable background. These issues are attributed to the spurious correlations between the object and background (semantic ambiguity) and the insufficient spatial perception ability of the classification network (spatial ambiguity). In this work, we propose a novel self-supervised framework to mitigate the semantic and spatial ambiguity from the perspectives of background bias and object perception. First, a background decoupling mechanism (BDM) is proposed to handle the semantic ambiguity by regularizing the consistency of the CAMs predicted for samples with identical foregrounds but different backgrounds. Thus, a decoupled relationship is constructed that reduces the dependence between the object instance and the scene information. Second, a global object-aware pooling (GOP) is introduced to alleviate spatial ambiguity. The GOP uses a learnable object-aware map to dynamically aggregate spatial information and further improve the quality of the CAMs. Extensive experiments demonstrate the effectiveness of our method, which achieves new state-of-the-art results on both the Pascal VOC 2012 and MS COCO 2014 datasets.
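The two ideas in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the object-aware map or of the consistency term, so everything below is an assumption made for illustration: the object-aware map is treated as per-location logits normalized with a softmax, the consistency term is a mean absolute difference between two CAMs, and the function names `global_average_pool`, `object_aware_pool`, and `cam_consistency_loss` are hypothetical. This is not the authors' implementation.

```python
import math


def _flatten(grid):
    """Flatten an H x W nested list into a flat list of floats."""
    return [value for row in grid for value in row]


def global_average_pool(feature_map):
    """Baseline pooling: every spatial location gets the same weight."""
    values = _flatten(feature_map)
    return sum(values) / len(values)


def object_aware_pool(feature_map, object_logits):
    """Weighted spatial pooling (assumed form, not the paper's exact GOP).

    `object_logits` stands in for a learnable object-aware map. A softmax
    over all locations turns it into weights that sum to 1, so locations
    with higher logits contribute more to the pooled value.
    """
    feats = _flatten(feature_map)
    logits = _flatten(object_logits)
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return sum(f * e / total for f, e in zip(feats, exps))


def cam_consistency_loss(cam_a, cam_b):
    """Mean absolute difference between two CAMs of the same shape.

    Assumed stand-in for the consistency regularizer: cam_a and cam_b are
    CAMs for the same foreground placed on two different backgrounds.
    """
    a = _flatten(cam_a)
    b = _flatten(cam_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

With an all-zero object-aware map the weighted pooling reduces to plain global average pooling, which is one way to see the map as a learned departure from uniform spatial weighting. The consistency term is zero when the two CAMs agree everywhere, the situation the abstract describes as a decoupled relationship between object and scene.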