Abstract: Back-propagation-based visualizations have been proposed to interpret deep neural networks (DNNs), and some of them produce interpretations with good visual quality. However, there are doubts about whether these intuitive visualizations actually reflect the network's decisions. Recent studies have confirmed this suspicion by showing that almost all of these modified back-propagation visualizations are not faithful to the model's decision-making process. Moreover, these visualizations produce vague "relative importance scores", in which inputs with low scores are not guaranteed to be independent of the final prediction. Hence, it is highly desirable to develop a novel back-propagation framework that guarantees theoretical faithfulness and produces a quantitative attribution score with a clear interpretation. To achieve this goal, we resort to mutual information theory to generate the interpretations, studying how much information about the output is encoded in each input neuron. The basic idea is to learn a source signal by back-propagation such that the mutual information between the input and the output is preserved as much as possible in the mutual information between the input and the source signal. In addition, we propose a Mutual Information Preserving Inverse Network, termed MIP-IN, in which the parameters of each layer are recursively trained to learn how to invert. During the inversion, the forward ReLU operation is adopted to adapt the general interpretations to the specific input. We then empirically demonstrate that the inverted source signal satisfies the completeness and minimality properties, which are crucial for a faithful interpretation. Furthermore, the empirical study validates the effectiveness of the interpretations generated by MIP-IN.
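
To make concrete the quantity the abstract relies on, the following minimal sketch computes the mutual information between a single discrete input feature and a discrete output from a joint probability table. It is not the MIP-IN method itself, only the textbook definition of mutual information. The function name `mutual_information` and the two toy tables are hypothetical and chosen purely for illustration.

```python
import math


def mutual_information(joint):
    """Return I(X; Y) in bits for a joint table (rows: X values, columns: Y values)."""
    total = sum(sum(row) for row in joint)
    probs = [[value / total for value in row] for row in joint]  # normalize to probabilities
    px = [sum(row) for row in probs]        # marginal distribution of X
    py = [sum(col) for col in zip(*probs)]  # marginal distribution of Y
    mi = 0.0
    for i, row in enumerate(probs):
        for j, p in enumerate(row):
            if p > 0:  # terms with p = 0 contribute nothing
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi


# Hypothetical toy tables (assumptions, not data from the paper).
informative = [[0.5, 0.0], [0.0, 0.5]]        # the feature fully determines the output
uninformative = [[0.25, 0.25], [0.25, 0.25]]  # the feature is independent of the output

mi_informative = mutual_information(informative)
mi_uninformative = mutual_information(uninformative)
```

In this toy setting, a feature that fully determines a balanced binary output carries one bit of information about it, while an independent feature carries none. This is the sense in which a mutual-information score has a quantitative reading.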

Supposed Maximum Mutual Information for Improving Generalization and Interpretation of Multi-Layered Neural Networks

Mutual Information and Diverse Decoding Improve Neural Machine Translation

Mutual Information Preserving Back-propagation: Learn to Invert for Faithful Attribution

Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator

Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning

Slicing Mutual Information Generalization Bounds for Neural Networks

Dissecting Deep Learning Networks—Visualizing Mutual Information

InfoNet: Neural Estimation of Mutual Information without Test-Time Optimization

Preserving domain private information via mutual information maximization

Mutual information analysis on non-stationary neuron importance for brain machine interfaces

Maximal Information Divergence from Statistical Models defined by Neural Networks

Analytic Mutual Information in Bayesian Neural Networks

Beyond Normal: On the Evaluation of Mutual Information Estimators

A Probabilistic Representation of Deep Learning for Improving The Information Theoretic Interpretability

Mutual information estimation for graph convolutional neural networks

Multimodal Representation Learning via Maximization of Local Mutual Information

A robust estimator of mutual information for deep learning interpretability

Going Deeper, Generalizing Better: an Information-Theoretic View for Deep Learning

Quantifying and Maximizing the Information Flux in Recurrent Neural Networks

Mutual Information Multinomial Estimation

Maximum Entropy and Minimal Mutual Information in a Nonlinear Model