Counterfactual explanation of Bayesian model uncertainty

Gohar Ali, Feras Al-Obeidat, Abdallah Tubaishat, Tehseen Zia, Muhammad Ilyas, Alvaro Rocha
DOI: https://doi.org/10.1007/s00521-021-06528-z
2021-09-24
Neural Computing and Applications
Abstract: Artificial intelligence systems are becoming ubiquitous in everyday life as well as in high-risk environments such as autonomous driving and medicine. The opaque nature of deep neural networks raises concerns about their adoption in high-risk environments, so it is important for researchers to explain how these models reach their decisions. Most existing methods rely on the softmax score to explain model decisions. However, softmax has been shown to be often misleading, in particular giving unjustifiably high confidence even for samples far from the training data. To overcome this shortcoming, we propose using Bayesian model uncertainty to produce counterfactual explanations. In this paper, we compare counterfactual explanations of models based on Bayesian uncertainty and on the softmax score. This work produces the minimal set of important features that maximally changes the classifier output, in order to explain the decision-making process of the Bayesian model. We used the MNIST and Caltech Bird 2011 datasets for experiments. The results show that the Bayesian model outperforms the softmax model and produces more concise and human-understandable counterfactuals.
computer science, artificial intelligence
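
The abstract's claim that a softmax score can be unjustifiably confident far from the training data, while a Bayesian predictive distribution need not be, can be illustrated with a toy sketch. This is not the authors' method: the abstract does not say how the Bayesian uncertainty is estimated. The two-class linear classifier, the Gaussian weight posterior and its width, and all names (`w_mean`, `w_samples`, `x_far`) are assumptions made up for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predictive_entropy(p):
    # Shannon entropy (in nats) of a single probability vector.
    return float(-(p * np.log(np.clip(p, 1e-12, 1.0))).sum())

rng = np.random.default_rng(0)

# Hypothetical point-estimate weights, shape (features, classes).
w_mean = np.array([[1.0, -1.0],
                   [-1.0, 1.0]])

# Assumed Gaussian posterior over the weights: 500 samples around w_mean.
# The standard deviation of 1.0 is an arbitrary illustrative choice.
w_samples = w_mean + rng.normal(scale=1.0, size=(500, 2, 2))

# An input placed far from the origin, standing in for a sample
# far from the training data.
x_far = np.array([40.0, 0.0])

# Point-estimate model: a single softmax over one set of logits.
p_point = softmax(x_far @ w_mean)

# Bayesian predictive: average the softmax outputs over weight samples.
logits_samples = np.einsum("f,sfc->sc", x_far, w_samples)
p_bayes = softmax(logits_samples).mean(axis=0)

print("point-estimate confidence:", p_point.max())
print("Bayesian confidence:      ", p_bayes.max())
print("point-estimate entropy:   ", predictive_entropy(p_point))
print("Bayesian entropy:         ", predictive_entropy(p_bayes))
```

In this sketch the point-estimate softmax saturates because the logits grow with the distance of the input from the origin. The sampled models disagree about the label for the same input, so their averaged prediction is less confident and has higher entropy.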