SSA: semantic structure aware inference on CNN networks for weakly pixel-wise dense predictions without cost

Yanpeng Sun, Zechao Li
DOI: https://doi.org/10.1007/s11704-024-3571-9
IF: 2.6688
2024-11-27
Frontiers of Computer Science
Abstract: Pixel-wise dense prediction tasks under weak supervision currently use Class Activation Maps (CAMs) to generate pseudo masks as ground truth. However, existing methods often incorporate trainable modules to expand the immature CAMs, which can add significant computational overhead and complicate training. In this work, we investigate the semantic structure information concealed within the CNN and propose a semantic structure aware inference (SSA) method that uses this information to obtain high-quality CAMs without any additional training cost. Specifically, the semantic structure modeling module (SSM) is first proposed to generate a class-agnostic semantic correlation representation, where each item denotes the affinity degree between one category of objects and all the others. Then, the immature CAMs are refined through a dot product operation that uses the semantic structure information. Finally, the polished CAMs from different backbone stages are fused as the output. The advantage of SSA lies in its parameter-free nature and the absence of additional training costs, which makes it suitable for various weakly supervised pixel-wise dense prediction tasks. We conducted extensive experiments on weakly supervised object localization and weakly supervised semantic segmentation, and the results confirm the effectiveness of SSA.
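The three steps named in the abstract (build a semantic correlation representation, refine the CAM with a dot product, fuse across backbone stages) can be illustrated with a rough NumPy sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the affinity is taken to be a row-normalized, non-negative cosine similarity between per-pixel features, fusion is taken to be a plain average after nearest-neighbour upsampling, and the function names `semantic_affinity`, `refine_cam`, `upsample_nearest`, and `fuse_stages` are hypothetical, not the authors' code.

```python
import numpy as np


def semantic_affinity(features):
    """Assumed affinity: row-normalized, non-negative cosine similarity.

    features: (C, H, W) feature map from one backbone stage.
    Returns an (H*W, H*W) matrix whose rows sum to 1.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w).T                    # (HW, C)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-8)                  # unit-length pixel features
    sim = np.clip(unit @ unit.T, 0.0, None)                # drop negative similarities
    return sim / np.maximum(sim.sum(axis=1, keepdims=True), 1e-8)


def refine_cam(cam, features):
    """Refine a coarse CAM by a dot product with the affinity matrix.

    cam: (K, H, W) class activation maps; features: (C, H, W).
    The same affinity is shared by all K classes (class-agnostic).
    """
    k, h, w = cam.shape
    affinity = semantic_affinity(features)
    refined = cam.reshape(k, h * w) @ affinity.T           # propagate activations
    return refined.reshape(k, h, w)


def upsample_nearest(cam, size):
    """Nearest-neighbour resize of (K, H, W) maps to (K, size[0], size[1])."""
    _, h, w = cam.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return cam[:, rows][:, :, cols]


def fuse_stages(cams, features, size):
    """Assumed fusion: average the refined CAMs of all stages at a common size."""
    refined = [upsample_nearest(refine_cam(c, f), size)
               for c, f in zip(cams, features)]
    return np.mean(refined, axis=0)
```

None of these functions has learnable parameters, which is the property the abstract emphasizes: the refinement only reuses features that the trained CNN already produces at inference time.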
computer science, information systems, theory & methods, software engineering
What problem does this paper attempt to address?