Abstract: Deep convolutional neural networks (CNNs) have achieved remarkable success in a range of computer vision tasks. However, their lack of interpretability has raised concerns and hindered adoption in critical domains. Generating activation maps that highlight the image regions contributing to a CNN's decision has become a popular way to visualize and interpret these models. Existing methods, however, often produce activation maps that are contaminated with irrelevant background noise or that activate only part of the object, which limits how meaningful their explanations are. To address this challenge, we propose Union Class Activation Mapping (UnionCAM), a visual interpretation framework that generates high-quality class activation maps (CAMs) in three steps. First, a denoising module removes background noise from CAMs using adaptive thresholding. Second, a union module fuses the denoised CAMs with region-based CAMs through an adaptive weighted combination, yielding more complete and informative maps that we refer to as fused CAMs. Third, an activation map selection module automatically selects, from the pool of fused CAMs, the one that offers the best interpretation. Extensive experiments on the ILSVRC2012 and VOC2007 datasets show that UnionCAM outperforms state-of-the-art methods: it suppresses background noise, captures complete object regions, and provides intuitive visual explanations, and it achieves significant improvements in insertion and deletion scores over the best baseline. By combining a denoising strategy, adaptive fusion of CAMs, and an automatic selection mechanism, UnionCAM helps bridge the gap between CNN performance and interpretability, offering a tool for understanding and trusting CNN-based systems and supporting the responsible deployment of CNNs in real-world applications.
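
The three modules described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the thresholding rule, the fusion weights, or the selection criterion, so the mean-plus-k-standard-deviations threshold, the scalar `weight`, the caller-supplied `score_fn`, and all function names below are hypothetical stand-ins.

```python
import numpy as np


def normalize(cam):
    """Scale a map to [0, 1]; a constant map becomes all zeros."""
    cam = np.asarray(cam, dtype=float)
    span = cam.max() - cam.min()
    if span == 0:
        return np.zeros_like(cam)
    return (cam - cam.min()) / span


def denoise(cam, k=0.0):
    """Step 1 (assumed rule): zero out activations below mean + k * std."""
    cam = normalize(cam)
    threshold = cam.mean() + k * cam.std()
    return np.where(cam >= threshold, cam, 0.0)


def fuse(denoised_cam, region_cam, weight=0.5):
    """Step 2 (assumed scheme): weighted sum of denoised and region-based CAMs."""
    fused = weight * denoised_cam + (1.0 - weight) * normalize(region_cam)
    return normalize(fused)


def select_best(fused_cams, score_fn):
    """Step 3: return the fused CAM that scores highest under score_fn."""
    scores = [score_fn(c) for c in fused_cams]
    return fused_cams[int(np.argmax(scores))]


# Toy 2x2 maps standing in for real CAMs.
cam = np.array([[0.1, 0.2], [0.9, 1.0]])
region_cam = np.array([[0.0, 1.0], [1.0, 1.0]])

denoised = denoise(cam)
candidates = [fuse(denoised, region_cam, w) for w in (0.2, 0.5, 0.8)]

# Placeholder score: total activation. A real criterion would more likely
# query the classifier, which the abstract does not describe.
best = select_best(candidates, score_fn=lambda c: float(c.sum()))
```

Here the weak top-row activations are removed by the threshold, and the fusion step restores the region that only the region-based map covers. In practice `score_fn` would presumably be tied to the model being explained, for example a metric related to the insertion and deletion scores used in the evaluation, though that choice is an assumption.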

Category boundary re-decision by component labels to improve generation of class activation map

Exclusive Feature Constrained Class Activation Mapping for Better Visual Explanation

Mining Larger Class Activation Map with Common Attribute Labels

Statistic-CAM: A Gradient-Free Visual Explanations for Deep Convolutional Network

G-CAM: Graph Convolution Network Based Class Activation Mapping for Multi-label Image Recognition

Class Activation Map Generation by Representative Class Selection and Multi-Layer Feature Fusion

DCAM: Disturbed Class Activation Maps for Weakly Supervised Semantic Segmentation

CR-CAM: Generating explanations for deep neural networks by contrasting and ranking features

Extracting Class Activation Maps from Non-Discriminative Features as well

Class Activation Map Generation by Multiple Level Class Grouping and Orthogonal Constraint

HAM: Hybrid Attention Module in Deep Convolutional Neural Networks for Image Classification

LayerCAM: Exploring Hierarchical Class Activation Maps for Localization

Hierarchical Class Grouping with Orthogonal Constraint for Class Activation Map Generation

CAM-loss: Towards Learning Spatially Discriminative Feature Representations

Feature Activation Map: Visual Explanation of Deep Learning Models for Image Classification

Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation

UnionCAM: enhancing CNN interpretability through denoising, weighted fusion, and selective high-quality class activation mapping

BroadCAM: Outcome-agnostic Class Activation Mapping for Small-scale Weakly Supervised Applications

DecomCAM: Advancing Beyond Saliency Maps through Decomposition and Integration

Exploit CAM by Itself: Complementary Learning System for Weakly Supervised Semantic Segmentation
