Abstract: Convolutional Neural Networks (CNNs) have demonstrated outstanding performance in various domains, such as face recognition, object detection, and image segmentation. However, the lack of transparency and limited interpretability inherent in CNNs pose challenges in fields such as medical diagnosis, autonomous driving, finance, and military applications. Several studies have explored the interpretability of CNNs and proposed various post-hoc interpretability methods. The majority of these methods are feature-based, focusing on the influence of input variables on outputs. Few methods analyze the parameters of CNNs or their overall structure. To explore the structure of CNNs and intuitively comprehend the role of their internal parameters, we propose an Attribution Graph-based Interpretable method for CNNs (AGIC), which models the overall structure of CNNs as graphs and provides interpretability from global and local perspectives. The runtime parameters of CNNs and the feature maps of each image sample are used to construct attribution graphs (At-GCs), where the convolutional kernels are represented as nodes and the SHAP values between kernel outputs are assigned as edges. These At-GCs are then employed to pretrain a newly designed heterogeneous graph encoder based on Deep Graph Infomax (DGI). To examine the overall structure of CNNs comprehensively, the pretrained encoder is used for two types of interpretable tasks: (1) a classifier is attached to the pretrained encoder for the classification of At-GCs, revealing the dependency of the topological characteristics of At-GCs on the image sample categories, and (2) a scoring aggregation (SA) network is constructed to assess the importance of each node in At-GCs, thus reflecting the relative importance of kernels in CNNs.
The experimental results indicate that the topological characteristics of an At-GC depend on the category of the sample used to construct it, which reveals that kernels in CNNs show distinct combined activation patterns when processing different image categories. Moreover, the kernels that receive high scores from the SA network are crucial for feature extraction, whereas low-scoring kernels can be pruned without affecting model performance. Together, these findings enhance the interpretability of CNNs.
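
As a rough illustration of the graph construction described in the abstract, the following minimal sketch treats kernels as nodes and kernel-to-kernel attribution values as weighted edges. It is not the authors' implementation: the names `AttributionGraph`, `build_attribution_graph`, and `node_scores` are hypothetical, the attribution matrices are invented toy values standing in for SHAP values computed elsewhere, and the node score is a simple sum of absolute edge weights swapped in for the learned SA network.

```python
from dataclasses import dataclass, field


@dataclass
class AttributionGraph:
    """Directed graph: nodes are kernels, edges carry attribution weights."""
    nodes: list = field(default_factory=list)  # (layer_index, kernel_index)
    edges: dict = field(default_factory=dict)  # (src_node, dst_node) -> weight


def build_attribution_graph(layer_attributions, threshold=0.0):
    """Build a graph from per-layer attribution matrices.

    layer_attributions[l][i][j] is a placeholder attribution (for example a
    SHAP value computed elsewhere) of kernel i in layer l on kernel j in
    layer l + 1. Edges with absolute weight <= threshold are dropped.
    """
    graph = AttributionGraph()
    seen = set()
    for l, matrix in enumerate(layer_attributions):
        for i, row in enumerate(matrix):
            for j, weight in enumerate(row):
                src, dst = (l, i), (l + 1, j)
                for node in (src, dst):
                    if node not in seen:
                        seen.add(node)
                        graph.nodes.append(node)
                if abs(weight) > threshold:
                    graph.edges[(src, dst)] = weight
    return graph


def node_scores(graph):
    """Heuristic stand-in for a learned scorer (assumption, not the SA network):
    sum of absolute weights over all edges touching each node."""
    scores = {node: 0.0 for node in graph.nodes}
    for (src, dst), weight in graph.edges.items():
        scores[src] += abs(weight)
        scores[dst] += abs(weight)
    return scores


# Toy attribution values, invented purely for illustration.
toy = [
    [[0.5, 0.0], [0.25, -0.75]],  # layer 0 (2 kernels) -> layer 1 (2 kernels)
    [[1.0], [0.0]],               # layer 1 (2 kernels) -> layer 2 (1 kernel)
]
graph = build_attribution_graph(toy)
scores = node_scores(graph)
ranked = sorted(scores, key=scores.get, reverse=True)  # most to least important
```

Under this toy heuristic, the kernels at the end of `ranked` would be the first candidates for pruning. In the method described above, that ranking instead comes from the SA network applied to the encoder pretrained with DGI.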

DRNet: Dissect and Reconstruct the Convolutional Neural Network Via Interpretable Manners

A Pixel-Level Explainable Approach of Convolutional Neural Networks and Its Application

Deeper Interpretability of Deep Networks

Interpretable Neural Network Decoupling.

Visual Interpretability for Deep Learning

Visualizing Surrogate Decision Trees of Convolutional Neural Networks

CHIP: Channel-wise Disentangled Interpretation of Deep Convolutional Neural Networks

PICNN: A Pathway towards Interpretable Convolutional Neural Networks

Transparent Projection Networks for Interpretable Image Recognition

DISCOVER: Making Vision Networks Interpretable via Competition and Dissection

CNN LEGO: Disassembling and Assembling Convolutional Neural Network

E Pluribus Unum Interpretable Convolutional Neural Networks

Interpretable Disentanglement of Neural Networks by Extracting Class-Specific Subnetwork

Visual Interpretability for Deep Learning: a Survey

Hybrid CNN-Interpreter: Interpret local and global contexts for CNN-based Models

Unsupervised Learning of Neural Networks to Explain Neural Networks (extended abstract)

Unsupervised Learning of Neural Networks to Explain Neural Networks

Disassembling Convolutional Segmentation Network

An attribution graph-based interpretable method for CNNs

An Interpretable CNN for the Segmentation of the Left Ventricle in Cardiac MRI by Real-Time Visualization

Interpretability for Reliable, Efficient, and Self-Cognitive DNNs: from Theories to Applications.