Abstract: The opacity of artificial intelligence (AI) systems, particularly in deep learning (DL), poses significant challenges to their comprehensibility and trustworthiness. This study aims to enhance the explainability of DL models through visual analytics (VA) and human-in-the-loop (HITL) principles, making these systems more transparent and understandable to end users. We propose a novel approach that uses a transition matrix to interpret the results of DL models through more comprehensible machine learning (ML) models. The methodology constructs a transition matrix between the feature spaces of the DL model and the ML model, treated as the formal and the mental model, respectively, which improves explainability for classification tasks. We validated the approach with computational experiments on the MNIST, FNC-1, and Iris datasets, comparing qualitatively and quantitatively how far the results obtained by our approach deviate from the ground truth of the training and testing samples. On the MNIST dataset, the approach significantly enhanced model clarity and understanding, with SSIM and PSNR values of 0.697 and 17.94, respectively, indicating high-fidelity reconstructions. On the FNC-1 dataset, it achieved an F1m score of 77.76% and a weighted accuracy of 89.38%, demonstrating its effectiveness in stance detection, complemented by its ability to explain key textual nuances. On the Iris dataset, the separating hyperplane constructed with the proposed approach improved classification accuracy. Overall, by combining VA, HITL principles, and a transition matrix, our approach significantly improves the explainability of DL models without compromising their performance, marking a step forward in developing more transparent and trustworthy AI systems.
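The idea of a transition matrix between two feature spaces can be illustrated with a minimal sketch. The abstract does not state how the matrix is computed, so the least-squares fit via the Moore-Penrose pseudoinverse below is an assumption, as are the function name `fit_transition_matrix`, the synthetic data, and the array shapes; it is not presented as the authors' implementation.

```python
import numpy as np


def fit_transition_matrix(dl_features, ml_features):
    """Return T minimizing ||dl_features @ T - ml_features|| (least squares).

    Assumption: rows are samples, columns are features, and the mapping
    between the two feature spaces is taken to be linear.
    """
    return np.linalg.pinv(dl_features) @ ml_features


# Synthetic stand-ins for the two feature spaces (hypothetical data).
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 16))      # features from a DL model (formal model)
T_true = rng.normal(size=(16, 4))   # ground-truth linear map for this toy case
B = A @ T_true                      # features of an interpretable ML model

T = fit_transition_matrix(A, B)

# Largest absolute error when mapping DL features into the ML feature space.
max_err = float(np.max(np.abs(A @ T - B)))
```

In this noise-free toy case the fitted matrix recovers the generating map up to floating-point error; with real features the residual would instead indicate how well a linear mapping relates the two spaces.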

Transparency and Explanation in Deep Reinforcement Learning Neural Networks

Verbal Explanations for Deep Reinforcement Learning Neural Networks with Attention on Extracted Features

Explainable Deep Reinforcement Learning: State of the Art and Challenges

A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges

Interpretable Deep Learning Models: Enhancing Transparency and Trustworthiness in Explainable AI

How Do You Act? An Empirical Study to Understand Behavior of Deep Reinforcement Learning Agents

Explainable Deep Learning: A Visual Analytics Approach with Transition Matrices

Integrated Commonsense Reasoning and Deep Learning for Transparent Decision Making in Robotics

Neural Reasoning Networks: Efficient Interpretable Neural Networks With Automatic Textual Explanations

Explainable Artificial Intelligence (XAI) for Increasing User Trust in Deep Reinforcement Learning Driven Autonomous Systems

Increasing Transparency of Reinforcement Learning using Shielding for Human Preferences and Explanations

Improving Transparency of Deep Neural Inference Process

Explanation of Reinforcement Learning Model in Dynamic Multi-Agent System

Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks

Neural network interpretability with layer-wise relevance propagation: novel techniques for neuron selection and visualization

A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts

Reason induced visual attention for explainable autonomous driving

Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models

Toward Transparent AI for Neurological Disorders: A Feature Extraction and Relevance Analysis Framework

Self-Supervised Discovering of Interpretable Features for Reinforcement Learning

Why the Agent Made that Decision: Explaining Deep Reinforcement Learning with Vision Masks