A Causality Inspired Framework for Model Interpretation

Chenwang Wu, Xiting Wang, Defu Lian, Xing Xie, Enhong Chen
DOI: https://doi.org/10.1145/3580305.3599240
2023-01-01
Abstract: This paper introduces a unified causal lens for understanding representative model interpretation methods. We show that their explanation scores align with the concept of average treatment effect in causal inference, which allows us to evaluate their relative strengths and limitations from a unified causal perspective. Based on our observations, we outline the major challenges in applying causal inference to model interpretation, including identifying common causes that can be generalized across instances and ensuring that explanations provide a complete causal explanation of model predictions. We then present CIMI, a Causality-Inspired Model Interpreter, which addresses these challenges. Our experiments show that CIMI provides more faithful and generalizable explanations with improved sampling efficiency, making it particularly suitable for larger pretrained models.
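To make the link between explanation scores and the average treatment effect concrete, here is a minimal sketch of a perturbation-style score read as an ATE estimate. It is not CIMI and is not taken from the paper: `toy_model`, `ate_of_feature`, the zero baseline, and the 50% random masking of the other features are all assumptions chosen for illustration. The "treatment" is keeping feature `i` at its observed value, the "control" is replacing it with the baseline, and the score is the mean difference in model output over randomly sampled contexts.

```python
import random


def toy_model(x):
    # Hypothetical stand-in for a black-box model: a fixed linear scorer.
    weights = [2.0, -1.0, 0.5]
    return sum(w * v for w, v in zip(weights, x))


def ate_of_feature(model, x, i, baseline=0.0, n_samples=200, seed=0):
    """Estimate the average effect on model(x) of keeping feature i
    versus masking it, averaged over random maskings of the other features."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Sample a context: each feature is kept or masked with probability 0.5.
        context = [v if rng.random() < 0.5 else baseline for v in x]
        treated = list(context)
        treated[i] = x[i]          # treatment: feature i present
        control = list(context)
        control[i] = baseline      # control: feature i masked
        total += model(treated) - model(control)
    return total / n_samples


x = [1.0, 3.0, 4.0]
scores = [ate_of_feature(toy_model, x, i) for i in range(len(x))]
```

For this linear toy model the effect of each feature does not depend on the sampled context, so each score equals the feature's weight times its value. For a nonlinear model the same estimator would average over interactions with the masked context, and its variance would depend on `n_samples`.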