Learning attributions grounded in existing facts for robust visual explanation

Yulong Wang, Xiaolin Hu, Hang Su
2018-01-01
Abstract: Visual explanation, which aims to interpret a model’s output by attributing it to input features, is an intuitive way to interpret black-box models. However, previous works over-emphasize foreground saliency visualization: they either generate similar saliency maps regardless of the class requested for explanation, or generate dispersive artifacts when the requested class does not exist in the input image. We identify the causes of these flaws as mode collapse and adversarial noise, which were previously observed in research on generative models. To overcome these shortcomings, we propose a general principle for perturbation-based optimization: learning attributions grounded in existing facts for robust visual explanation. We further propose a Hierarchical Attribution Fusion (HAF) technique to mitigate the artifacts without extra smoothing regularization. To evaluate visual explanations more thoroughly, we propose a new evaluation task, Distracted Weakly Supervised Object Localization (DWSOL), which measures whether these methods can correctly attribute the output of the requested class to the existing facts. Experiments show that previous methods fail to conform to this criterion, and that our principle helps improve robustness by suppressing false saliency responses.