Feature Interaction Interpretability and Beyond

Michael Tsang, Yan Liu
2020-01-01
Abstract: We demonstrate the various advantages of interpreting feature interactions in modern prediction models. We leverage our ongoing work on Neural Interaction Detection (NID) [16] to identify interactions on feature perturbations and their inferences through black-box models. As part of this process, we propose an alternate form of NID called GradientNID, which exactly detects relevant interactions in neural network explainer models. Across diverse application domains like image, text