Abstract: Artificial neural networks (ANNs) are powerful tools for data analysis and are particularly suitable for modeling relationships between variables for best prediction of an outcome. While these models can be used to answer many important research questions, their utility has been critically limited because the "black box" model is difficult to interpret. Clinical investigators usually employ ANN models to predict clinical outcomes or to make a diagnosis; the model, however, is difficult for clinicians to interpret. To address this important shortcoming of neural network modeling methods, we describe several methods to help subject-matter audiences (e.g., clinicians, medical policy makers) understand neural network models. Garson's algorithm describes the relative magnitude of the importance of a descriptor (predictor) in its connection with outcome variables by dissecting the model weights. Lek's profile method explores the relationship between the outcome variable and a predictor of interest, while holding the other predictors at constant values (e.g., minimum, 20th percentile, maximum). While Lek's profile was developed specifically for neural networks, the partial dependence plot is a more generic version that visualizes the relationship between an outcome and one or two predictors. Finally, the local interpretable model-agnostic explanations (LIME) method can explain the predictions of any classification or regression model by approximating it locally with an interpretable model. R code implementing these methods is shown using example data fitted with a standard feed-forward neural network model. We provide code and step-by-step descriptions of how to use these tools to facilitate better understanding of ANNs.
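To make the idea of "dissecting the model weights" concrete, the following is a minimal sketch of Garson's algorithm as it is commonly formulated for a network with one hidden layer and a single output. It is written in Python with NumPy rather than the R used in the paper, and it is not the authors' implementation. The function name `garson_importance`, the argument names, and the example weights are all hypothetical. The sketch assumes that bias weights are ignored and that every hidden unit has at least one nonzero incoming weight and a nonzero outgoing weight.

```python
import numpy as np


def garson_importance(w_input_hidden, w_hidden_output):
    """Relative importance of each input (hypothetical helper, see assumptions above).

    w_input_hidden:  array of shape (n_inputs, n_hidden), input-to-hidden weights
    w_hidden_output: array of shape (n_hidden,), hidden-to-output weights
    """
    w_ih = np.abs(np.asarray(w_input_hidden, dtype=float))
    w_ho = np.abs(np.asarray(w_hidden_output, dtype=float))

    # Contribution of input i to the output through hidden unit j.
    contrib = w_ih * w_ho

    # Share of each input within every hidden unit (columns sum to 1).
    share = contrib / contrib.sum(axis=0, keepdims=True)

    # Sum the shares across hidden units, then rescale so the result sums to 1.
    totals = share.sum(axis=1)
    return totals / totals.sum()


# Made-up weights for a network with 2 inputs and 2 hidden units.
w_ih = np.array([[1.0, 2.0],
                 [3.0, 2.0]])
w_ho = np.array([0.5, -1.0])

importance = garson_importance(w_ih, w_ho)
print(importance)  # [0.375 0.625]
```

Because only absolute weights are used, the result ranks predictors by magnitude of importance and says nothing about the direction of their effect on the outcome.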

Interpret Neural Networks by Extracting Critical Subnetworks

Fooling Neural Network Interpretations - Adversarial Noise to Attack Images

Interpretability Based Neural Network Repair

Interpretable Disentanglement of Neural Networks by Extracting Class-Specific Subnetwork

Functional Network: A Novel Framework for Interpretability of Deep Neural Networks

Towards Interpreting Recurrent Neural Networks Through Probabilistic Abstraction

Interpret Neural Networks by Identifying Critical Data Routing Paths

Visualizing and Understanding Neural Models in NLP

Opening the Black Box of Neural Networks: Methods for Interpreting Neural Network Models in Clinical Applications

Analyzing the Noise Robustness of Deep Neural Networks

A Comprehensive Review of Deep Neural Network Interpretation Using Topological Data Analysis

Towards Interpretable Deep Neural Networks by Leveraging Adversarial Examples

Identifying Sub-networks in Neural Networks via Functionally Similar Representations

Improving Network Interpretability via Explanation Consistency Evaluation

Interpreting and Improving Adversarial Robustness of Deep Neural Networks With Neuron Sensitivity

A Survey on Neural Network Interpretability

Neural network interpretability with layer-wise relevance propagation: novel techniques for neuron selection and visualization

Interpreting Adversarial Examples by Activation Promotion and Suppression

Visual Interpretability for Deep Learning

DISCOVER: Making Vision Networks Interpretable via Competition and Dissection

Interpreting and Evaluating Neural Network Robustness