Abstract: Background: Explainable artificial intelligence (XAI) can enhance trust in mental state classifications by explaining the reasoning behind the outputs of artificial intelligence (AI) models, especially for high-dimensional and highly correlated brain signals. Feature importance and counterfactual explanations are two common approaches to generating these explanations, but both have drawbacks. Feature importance methods, such as Shapley additive explanations (SHAP), can be computationally expensive and sensitive to feature correlation, whereas counterfactual explanations explain only a single outcome rather than the entire model. Methods: To overcome these limitations, we propose a new procedure for computing global feature importance that aggregates local counterfactual explanations. This approach is specifically tailored to fMRI signals and is based on the hypothesis that instances close to the decision boundary and their counterfactuals differ mainly in the features identified as most important for the downstream classification task. We refer to the proposed feature importance measure as Boundary Crossing Solo Ratio (BoCSoR), since it quantifies how frequently a change in each feature in isolation leads to a change in the classification outcome, i.e., the crossing of the model's decision boundary. Results and conclusions: Experimental results on synthetic data and on real, publicly available fMRI data from the Human Connectome Project show that the proposed BoCSoR measure is more robust to feature correlation and less computationally expensive than state-of-the-art methods. Additionally, it is equally effective in explaining the behavior of any AI model for brain signals. These properties are crucial for medical decision support systems, where many different features are often extracted from the same physiological measures and a gold standard is absent. In such settings, computing feature importance can become computationally expensive and features are likely to be mutually correlated, which makes the results of state-of-the-art XAI methods unreliable.
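
To make the idea of a "solo" boundary crossing concrete, the following Python sketch computes, for each feature, the fraction of instances whose predicted label flips when that feature alone is replaced by its counterfactual value. It is one reading of the description above, not the authors' implementation: the function names (`solo_crossing_ratio`, `nearest_unlike_neighbour`, `toy_predict`), the nearest-unlike-neighbour counterfactual generator, and the toy classifier are all illustrative assumptions, and the sketch does not restrict the computation to instances close to the decision boundary.

```python
import numpy as np


def solo_crossing_ratio(predict, X, X_cf):
    """Per-feature fraction of instances whose predicted label flips when
    only that feature is replaced by its counterfactual value.

    Assumption: this is an illustrative reading of the measure described
    in the abstract, not the authors' reference implementation.
    """
    X = np.asarray(X, dtype=float)
    X_cf = np.asarray(X_cf, dtype=float)
    base = predict(X)
    ratios = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        X_mod = X.copy()
        X_mod[:, j] = X_cf[:, j]  # change feature j in isolation
        ratios[j] = np.mean(predict(X_mod) != base)
    return ratios


def nearest_unlike_neighbour(predict, X):
    """Hypothetical counterfactual generator: for each instance, return the
    closest instance (Euclidean distance) with a different predicted label."""
    X = np.asarray(X, dtype=float)
    labels = predict(X)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Exclude instances with the same predicted label (including itself).
    dists[labels[:, None] == labels[None, :]] = np.inf
    return X[np.argmin(dists, axis=1)]


def toy_predict(X):
    """Toy classifier (assumption): the label depends on feature 0 only."""
    return (np.asarray(X)[:, 0] > 0).astype(int)


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X_cf = nearest_unlike_neighbour(toy_predict, X)
ratios = solo_crossing_ratio(toy_predict, X, X_cf)
print(ratios)
```

Because the toy classifier looks only at feature 0, swapping in the counterfactual value of that feature always crosses the boundary, while swapping either of the other two features never does. Any model exposing a `predict`-style function could be substituted, which is what makes this kind of measure model-agnostic.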

From local counterfactuals to global feature importance: efficient, robust, and model-agnostic explanations for brain connectivity networks
