An Interpretable Deep Classifier for Counterfactual Generation.

Wei Zhang, Brian Barr, John Paisley
DOI: https://doi.org/10.1145/3533271.3561722
2022-01-01
Abstract: Counterfactual explanation is central to interpretable machine learning, which requires a trained model not only to make inferences but also to justify them. This problem is crucial in many fields, such as fintech and healthcare, where accurate decisions and their justifications are equally important. Many studies have leveraged the power of deep generative models for counterfactual generation. However, most focus on vision data and leave the latent space unsupervised. In this paper, we propose a new and general framework that uses a supervised extension to the Variational Auto-Encoder (VAE) with Normalizing Flow (NF) for simultaneous classification and counterfactual generation. Experiments on two tabular financial datasets, Lending Club (LCD) and Give Me Some Credit (GMC), show that the model can achieve state-of-the-art prediction accuracy while also producing meaningful counterfactual examples to interpret and justify the classifier’s decision.