Abstract: Although complex machine learning models (e.g., random forests, neural networks) commonly outperform traditional, simple interpretable models (e.g., linear regression, decision trees), clinicians in the healthcare domain find it hard to understand and trust these complex models because their predictions lack intuition and explanation. With the new General Data Protection Regulation (GDPR), the plausibility and verifiability of predictions made by machine learning models have become essential. Hence, interpretability techniques for machine learning models are an active focus of research. In general, the main aim of these techniques is to shed light on the prediction process of a machine learning model and to explain how its predictions are generated. A major problem in this context is that both the quality of interpretability techniques and trust in machine learning model predictions are challenging to measure. In this article, we propose four fundamental quantitative measures for assessing the quality of interpretability techniques: similarity, bias detection, execution time, and trust. We present a comprehensive experimental evaluation of six recent and popular local model-agnostic interpretability techniques, namely LIME, SHAP, Anchors, LORE, ILIME, and MAPLE, on different types of real-world healthcare data. Building on previous work, our experimental evaluation compares the techniques on several aspects: identity, stability, separability, similarity, execution time, bias detection, and trust. The results of our experiments show that MAPLE achieves the highest performance on the identity metric across all data sets included in this study, while LIME achieves the lowest. LIME achieves the highest performance on the separability metric across all data sets. SHAP has the smallest average time to output an explanation across all data sets included in this study. For bias detection, SHAP and MAPLE enable the participants to better detect the bias. For the trust metric, Anchors achieves the highest performance on all data sets included in this work.
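The abstract names the identity and separability metrics without defining them. As a rough illustration only, the sketch below assumes the definitions commonly used in prior work on explanation consistency: identity means that explaining the same instance twice yields the same explanation, and separability means that distinct instances receive distinct explanations. The `explain` callable, the function names, and the toy explainers are hypothetical stand-ins and not the study's implementation.

```python
import numpy as np

def identity_score(explain, X, runs=2):
    """Percentage of instances whose explanation is unchanged across repeated calls.

    `explain` is a hypothetical callable mapping one instance (1-D array)
    to a vector of feature attributions.
    """
    X = np.asarray(X, dtype=float)
    same = 0
    for x in X:
        first = np.asarray(explain(x))
        # Re-explain the same instance and compare with the first explanation.
        if all(np.allclose(first, explain(x)) for _ in range(runs - 1)):
            same += 1
    return 100.0 * same / len(X)

def separability_score(explain, X):
    """Percentage of pairs of distinct instances that receive different explanations."""
    X = np.asarray(X, dtype=float)
    E = [np.asarray(explain(x)) for x in X]
    pairs = differing = 0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if np.array_equal(X[i], X[j]):
                continue  # identical instances are covered by the identity metric
            pairs += 1
            if not np.allclose(E[i], E[j]):
                differing += 1
    return 100.0 * differing / pairs if pairs else float("nan")

# Toy explainers (assumptions for illustration only).
rng = np.random.default_rng(0)
deterministic = lambda x: 2.0 * x                         # same input, same output
noisy = lambda x: 2.0 * x + rng.normal(size=x.shape)      # sampling-based, varies per call
constant = lambda x: np.ones_like(x)                      # ignores the input entirely

X_demo = np.array([[0.0, 1.0], [1.0, 2.0], [3.0, 5.0]])
id_det = identity_score(deterministic, X_demo)
id_noisy = identity_score(noisy, X_demo)
sep_det = separability_score(deterministic, X_demo)
sep_const = separability_score(constant, X_demo)
```

Under these assumed definitions, an explainer that relies on random sampling around the instance would tend to score low on identity, because repeated calls can return different attributions, while a deterministic explainer scores 100%.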

Interpretability in healthcare: A comparative study of local machine learning interpretability techniques
