Abstract: Deep convolutional neural networks (CNNs) have achieved remarkable success in various computer vision tasks. However, their lack of interpretability has raised concerns and hindered their adoption in critical domains. Generating activation maps that highlight the regions contributing to a CNN's decision has emerged as a popular way to visualize and interpret these models. Nevertheless, existing methods often produce activation maps that are contaminated with irrelevant background noise or that activate only part of the object, which limits how meaningful their explanations are. To address this challenge, we propose Union Class Activation Mapping (UnionCAM), a visual interpretation framework that generates high-quality class activation maps (CAMs) through a three-step approach built around a weighted fusion strategy that adaptively combines multiple CAMs. First, the denoising module removes background noise from CAMs by adaptive thresholding. Subsequently, the union module fuses the denoised CAMs with region-based CAMs using a weighted combination scheme to obtain more comprehensive and informative maps, which we refer to as fused CAMs. Lastly, the activation map selection module automatically selects, from the pool of fused CAMs, the CAM that offers the best interpretation. Extensive experiments on the ILSVRC2012 and VOC2007 datasets demonstrate UnionCAM's superior performance over state-of-the-art methods: it effectively suppresses background noise, captures complete object regions, and provides intuitive visual explanations. UnionCAM achieves significant improvements in insertion and deletion scores, outperforming the best baseline. Its contributions are a denoising strategy, an adaptive fusion of CAMs, and an automatic selection mechanism. It bridges the gap between CNN performance and interpretability, providing a valuable tool for understanding and trusting CNN-based systems, and has the potential to foster responsible deployment of CNNs in real-world applications.
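
The three steps named in the abstract (denoise, fuse, select) can be pictured with a minimal sketch. The abstract does not give the thresholding rule, the fusion weights, or the selection criterion, so everything specific here is an assumption made for illustration: the mean-based threshold, the fixed weight grid, the stand-in score, and all function names. It is not the authors' implementation, and it uses plain Python lists only to stay dependency-free.

```python
from typing import Callable, List, Sequence

Cam = List[List[float]]  # 2-D activation map, values assumed to lie in [0, 1]


def denoise(cam: Cam, k: float = 1.0) -> Cam:
    """Zero out activations below a per-map threshold.

    Assumption: threshold = k * mean activation of this map. The abstract
    only says "adaptive thresholding" and does not give the rule.
    """
    values = [v for row in cam for v in row]
    threshold = k * sum(values) / len(values)
    return [[v if v >= threshold else 0.0 for v in row] for row in cam]


def fuse(cam_a: Cam, cam_b: Cam, weight: float) -> Cam:
    """Pixel-wise weighted combination: weight * cam_a + (1 - weight) * cam_b."""
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(cam_a, cam_b)
    ]


def select_best(candidates: Sequence[Cam], score: Callable[[Cam], float]) -> Cam:
    """Return the candidate map with the highest score."""
    return max(candidates, key=score)


def union_pipeline(
    cam: Cam,
    region_cam: Cam,
    score: Callable[[Cam], float],
    weights: Sequence[float] = (0.25, 0.5, 0.75),  # hypothetical weight grid
) -> Cam:
    """Denoise one CAM, fuse it with a region-based CAM, keep the best result."""
    clean = denoise(cam)
    fused = [fuse(clean, region_cam, w) for w in weights]
    return select_best(fused, score)


def total_activation(cam: Cam) -> float:
    """Stand-in score for the demo; a real criterion would query the classifier."""
    return sum(sum(row) for row in cam)


# Toy inputs: a noisy CAM and a cleaner region-based CAM of the same size.
toy_cam = [[0.1, 0.2, 0.1], [0.2, 0.9, 0.8], [0.1, 0.8, 0.1]]
toy_region_cam = [[0.0, 0.0, 0.0], [0.0, 1.0, 1.0], [0.0, 1.0, 1.0]]
best_cam = union_pipeline(toy_cam, toy_region_cam, score=total_activation)
```

In practice the `score` callable would more plausibly be tied to the model itself, for example a confidence-based measure in the spirit of the insertion and deletion scores mentioned above, and the maps would be arrays from a numerical library rather than nested lists.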

Decom-CAM: Tell Me What You See, in Details! Feature-Level Interpretation Via Decomposition Class Activation Map

DecomCAM: Advancing Beyond Saliency Maps through Decomposition and Integration

Exclusive Feature Constrained Class Activation Mapping for Better Visual Explanation

Statistic-CAM: A Gradient-Free Visual Explanations for Deep Convolutional Network

Feature CAM: Interpretable AI in Image Classification

Integrated feature analysis for deep learning interpretation and class activation maps

UnionCAM: enhancing CNN interpretability through denoising, weighted fusion, and selective high-quality class activation mapping

Overview of Class Activation Maps for Visualization Explainability

Feature Activation Map: Visual Explanation of Deep Learning Models for Image Classification

CAManim: Animating end-to-end network activation maps

CR-CAM: Generating explanations for deep neural networks by contrasting and ranking features

Integrative CAM: Adaptive Layer Fusion for Comprehensive Interpretation of CNNs

Towards the Visualization of Aggregated Class Activation Maps to Analyse the Global Contribution of Class Features

CAPE: CAM as a Probabilistic Ensemble for Enhanced DNN Interpretation

Cluster-CAM: Cluster-weighted visual interpretation of CNNs' decision in image classification

A Novel Visual Interpretability for Deep Neural Networks by Optimizing Activation Maps with Perturbation

Shap-CAM: Visual Explanations for Convolutional Neural Networks Based on Shapley Value

KPCA-CAM: Visual Explainability of Deep Computer Vision Models using Kernel PCA

Extracting Class Activation Maps from Non-Discriminative Features as well

Respond-CAM: Analyzing Deep Models for 3D Imaging Data by Visualizations

Interpretable Deep Convolutional Neural Networks via Meta-learning