Abstract: Deep learning has achieved remarkable success in processing and managing unstructured data. However, its "black box" nature imposes significant limitations, particularly in sensitive application domains. While existing interpretable machine learning methods address some of these issues, they often fail to adequately consider feature correlations and provide insufficient evaluation of model decision paths. To overcome these challenges, this paper introduces Real Explainer (RealExp), an interpretability computation method that decouples the Shapley Value into individual feature importance and feature correlation importance. By incorporating feature similarity computations, RealExp enhances interpretability by precisely quantifying both individual feature contributions and their interactions, leading to more reliable and nuanced explanations. Additionally, this paper proposes a novel interpretability evaluation criterion focused on elucidating the decision paths of deep learning models, going beyond traditional accuracy-based metrics. Experimental validations on two unstructured data tasks -- image classification and text sentiment analysis -- demonstrate that RealExp significantly outperforms existing methods in interpretability. Case studies further illustrate its practical value: in image classification, RealExp aids in selecting suitable pre-trained models for specific tasks from an interpretability perspective; in text classification, it enables the optimization of models and approximates the performance of a fine-tuned GPT-Ada model using traditional bag-of-words approaches.


### What problems does this paper attempt to solve?

This paper aims to solve two main problems in the interpretability of deep-learning models:

1. **Insufficient Consideration of Feature Correlation**:
   - Existing interpretability methods usually do not fully account for the correlation between features when calculating feature importance. For example, the Shapley Value method mainly considers the interaction between features when calculating feature importance, while ignoring the independent contribution of each feature. When features are redundant or strongly collinear, the Shapley Value may underestimate the importance of some features, even if these features occupy important positions in the decision tree.
2. **Inconsistency between Model Decision Path and Human Reasoning Path**:
   - Existing interpretability methods often check only whether the model's decision results match human expectations, and ignore whether the model's decision path follows human reasoning logic. For example, in a violent-scene recognition task, the regions identified by the model and by experts may be the same, yet the model's reasoning order (from black pants to white shirt and then to fighting actions) is the opposite of the experts' order, which makes the explanation less reliable.

To solve these problems, the paper proposes a new interpretability calculation method, **Real Explainer (RealExp)**. Specifically, RealExp improves on existing methods in the following ways:

- **Redefining the Calculation of Shapley Value**: The original Shapley Value is decomposed into two parts, an independent contribution and an interaction contribution, so that the contribution of each feature is quantified more accurately.
- **Introducing Feature Similarity Calculation**: The similarity between features is used to adjust the calculation of marginal contributions, so that feature correlation is fully taken into account.
- **Proposing a New Interpretability Evaluation Criterion**: Expert annotation is combined with a tau coefficient to evaluate the model's decision path, so that the explanation is not only accurate but also consistent with human reasoning logic.

Through these improvements, RealExp can provide more reliable and detailed explanations, especially in high-risk application scenarios such as medical and industrial detection, enhancing users' trust in the model.
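To make the redundancy issue concrete, the following minimal sketch computes the standard (unmodified) Shapley Value on a toy problem. It is not the RealExp decomposition, whose formulas are not given in this summary. The value function `toy_value`, the feature indices, and the payoffs 1.0 and 0.6 are all hypothetical and chosen only for illustration.

```python
from itertools import permutations


def shapley_values(n_features, value):
    """Exact Shapley values: average each feature's marginal
    contribution over every possible ordering of the features."""
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        coalition = set()
        for feature in order:
            before = value(frozenset(coalition))
            coalition.add(feature)
            after = value(frozenset(coalition))
            totals[feature] += after - before
    return [total / len(orderings) for total in totals]


def toy_value(coalition):
    """Hypothetical model payoff (an assumption for this example).

    Features 0 and 1 are exact duplicates of one signal worth 1.0;
    feature 2 is an independent signal worth 0.6.
    """
    signal = 1.0 if (0 in coalition or 1 in coalition) else 0.0
    independent = 0.6 if 2 in coalition else 0.0
    return signal + independent


phi = shapley_values(3, toy_value)
print([round(p, 3) for p in phi])  # prints [0.5, 0.5, 0.6]
```

Each duplicated feature receives 0.5, which ranks it below the independent feature (0.6) even though the signal it carries is worth 1.0 on its own. This is the kind of credit splitting between correlated features that the summary describes as underestimation.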
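The decision-path criterion can be illustrated with a rank-correlation sketch. This assumes the "tau coefficient" is Kendall's tau, which the summary does not state explicitly, and the function name `kendall_tau` and the region labels are hypothetical. The example compares an expert's reasoning order with the model's order from the violent-scene example above.

```python
from itertools import combinations


def kendall_tau(expert_order, model_order):
    """Kendall's tau-a between two orderings of the same items (no ties).

    Returns 1.0 for identical orderings and -1.0 for fully reversed ones.
    """
    if len(set(expert_order)) != len(expert_order):
        raise ValueError("items must be unique")
    if set(expert_order) != set(model_order) or len(expert_order) != len(model_order):
        raise ValueError("both orderings must contain the same items")
    if len(expert_order) < 2:
        raise ValueError("need at least two items to compare orderings")

    expert_rank = {item: i for i, item in enumerate(expert_order)}
    model_rank = {item: i for i, item in enumerate(model_order)}

    concordant = 0
    discordant = 0
    for a, b in combinations(expert_order, 2):
        # A pair is concordant when both rankings place a and b in the same relative order.
        agreement = (expert_rank[a] - expert_rank[b]) * (model_rank[a] - model_rank[b])
        if agreement > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Orders taken from the violent-scene example: the model attends to the
# same regions as the expert, but in the opposite order.
expert = ["fighting action", "white shirt", "black pants"]
model = ["black pants", "white shirt", "fighting action"]

tau = kendall_tau(expert, model)
print(tau)  # prints -1.0
```

A score near 1 would indicate that the model visits the evidence in roughly the same order as the expert, while the -1.0 here flags a fully reversed path even though the identified regions are identical.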

Explaining the Unexplained: Revealing Hidden Correlations for Better Interpretability
