Cross-Patch Relation Enhanced for Weakly Supervised Semantic Segmentation

Huiqing Su, Wenqin Huang, Qingmin Liao, Zongqing Lu
DOI: https://doi.org/10.1109/ijcnn60899.2024.10649952
2024-01-01
Abstract: Weakly Supervised Semantic Segmentation (WSSS) using only image-level labels relies on Class Activation Maps (CAMs) to produce pixel-level pseudo segmentation labels, but CAMs activate only limited object regions, resulting in low-quality annotations. To address this issue, a local-to-global framework is employed to enable the model to capture details from patches randomly cropped from input images. However, the pseudo-masks generated by this approach still suffer from incomplete objects. We observe that this is caused by neglecting the semantic relations among patches, which carry abundant contextual information. Based on this observation, we present a Cross-Patch Relation Enhanced Network to improve the quality of the CAMs, leading to better pseudo segmentation labels. Specifically, a cross-patch relation attention module, comprising class-prototype extraction and class-feature aggregation, is proposed to alleviate the intra-class inconsistency caused by variations in contextual information across local patches. The class-prototype extraction module gathers contextual relations from all local class-region embeddings. In addition, the class-feature aggregation module improves the class-level representations of multiple patches through feature aggregation. Extensive experiments on two public datasets demonstrate the effectiveness of the proposed method. Our method achieves scores competitive with state-of-the-art methods for weakly supervised semantic segmentation on both the PASCAL VOC 2012 and MS-COCO 2014 benchmarks.
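The abstract does not give the formulation of the cross-patch relation attention, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes that class-region embeddings are CAM-weighted averages of each patch's pixel features, that class prototypes pool those embeddings over all patches, and that aggregation is a softmax attention from pixel features to prototypes with a residual connection. All function names, tensor shapes, and the choice of weighting are hypothetical.

```python
import numpy as np


def class_region_embeddings(feats, cams, eps=1e-6):
    """CAM-weighted average of pixel features, per patch and per class.

    feats: (P, N, D) features for P patches, N pixels each, D channels.
    cams:  (P, N, C) non-negative class activations for C classes.
    Returns an array of shape (P, C, D).
    Assumption: the abstract does not specify this pooling.
    """
    # Normalise activations over the pixels of each patch.
    weights = cams / (cams.sum(axis=1, keepdims=True) + eps)
    return np.einsum("pnc,pnd->pcd", weights, feats)


def class_prototypes(embeds, cams, eps=1e-6):
    """Pool per-patch class embeddings into one prototype per class.

    Each patch is weighted by its total activation for the class, so
    patches that barely contain a class contribute little (an assumption).
    Returns an array of shape (C, D).
    """
    mass = cams.sum(axis=1)                              # (P, C)
    weights = mass / (mass.sum(axis=0, keepdims=True) + eps)
    return np.einsum("pc,pcd->cd", weights, embeds)


def aggregate_class_features(feats, protos):
    """Enhance pixel features with cross-patch class context.

    Every pixel attends to the shared class prototypes (softmax over
    classes) and the attended context is added back residually.
    Returns an array with the same shape as feats.
    """
    dim = feats.shape[-1]
    logits = feats @ protos.T / np.sqrt(dim)             # (P, N, C)
    logits = logits - logits.max(axis=-1, keepdims=True) # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=-1, keepdims=True)
    return feats + attn @ protos                         # (P, N, D)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 16, 8))   # 4 patches, 16 pixels, 8 channels
    cams = rng.random((4, 16, 3))         # 3 classes
    protos = class_prototypes(class_region_embeddings(feats, cams), cams)
    enhanced = aggregate_class_features(feats, protos)
    print(protos.shape, enhanced.shape)
```

Because the prototypes are shared by all patches, two patches showing different parts of the same object are pulled toward the same class representation, which is the kind of intra-class consistency the abstract describes.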