Maximum Entropy Baseline for Integrated Gradients

Hanxiao Tan
DOI: https://doi.org/10.48550/arXiv.2204.05948
2022-04-13
Abstract: Integrated Gradients (IG), one of the most popular explainability methods available, still remains ambiguous in the selection of baseline, which may seriously impair the credibility of the explanations. This study proposes a new uniform baseline, i.e., the Maximum Entropy Baseline, which is consistent with the "uninformative" property of baselines defined in IG. In addition, we propose an improved ablating evaluation approach incorporating the new baseline, where the information conservativeness is maintained. We explain the linear transformation invariance of IG baselines from an information perspective. Finally, we assess the reliability of the explanations generated by different explainability methods and different IG baselines through extensive evaluation experiments.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses two problems with the Integrated Gradients (IG) method: **ambiguity in baseline selection** and **the difficulty of assessing the reliability of the resulting explanations**. Specifically:

1. **Ambiguity in baseline selection**:
   - IG relies on a baseline \( x' \) and integrates gradients along the path from the baseline to the input \( x \). However, there are no clear criteria for choosing the baseline, and different baselines can lead to significantly different explanations.
   - Commonly used baselines (zero-padding, black/white vectors, random initialization, etc.) are effective in some cases, but there is no unified quantitative standard for measuring how "uninformative" they are.
2. **Reliability assessment of explanations**:
   - The lack of ground truth makes it difficult to assess the reliability of explanations.
   - Existing ablation tests have deficiencies: there are no unified substitute pixels, and information conservation cannot be guaranteed.

To solve these problems, the paper proposes the following improvements:

- **Maximum Entropy Baseline**:
  - A new baseline selection method, the maximum entropy baseline, chooses the input that maximizes the entropy of the model's output distribution, so that the baseline is "uninformative".
  - It is defined as:
    \[ B_{X_{\text{entr}}} = \arg\max_x H(\text{Softmax}(f_l(x))) \]
    where \( f_l(x) \) is the logits output of the model and \( H(\cdot) \) is the entropy function:
    \[ H(A) = -\sum_{i = 1}^n P(a_i)\log P(a_i) \]
- **Improved ablation test**:
  - A new entropy-based ablation test ensures information conservation and uses unified substitute pixels for evaluation.
  - The entropy of the logits is monitored as a quantitative indicator of the amount of information, which ensures that the amount of information in the input is reduced after ablation. The information content of the baseline is approximated as the inverse of that entropy (\( \sigma \) denotes the Softmax):
    \[ I(B_{X_{\text{entr}}}) \approx \frac{1}{H(\sigma(f_l(B_{X_{\text{entr}}})))} \]
- **Linear transformation invariance**:
  - The paper explains, from an information perspective, why some baselines perform well under uniform linear transformations but fail under non-uniform linear transformations.

Through these improvements, the paper aims to make the explanations produced by IG more reliable and easier to interpret.
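
The definition of the Maximum Entropy Baseline and the inverse-entropy information proxy can be illustrated with a small sketch. It is not the paper's implementation: this summary does not say how the arg max is solved, so plain gradient ascent is an assumption here, and the linear model `logits` is a hypothetical stand-in for \( f_l \). All function names and the example weights are made up for illustration.

```python
import math

def softmax(z):
    # Numerically stable Softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def entropy(p):
    # H(A) = -sum_i P(a_i) log P(a_i); zero-probability terms contribute 0.
    return -sum(q * math.log(q) for q in p if q > 0.0)

def logits(W, b, x):
    # Hypothetical stand-in for f_l: a linear model z = W x + b.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def entropy_of_input(W, b, x):
    # H(Softmax(f_l(x))), the objective in the baseline definition.
    return entropy(softmax(logits(W, b, x)))

def entropy_grad(W, b, x):
    # For p = Softmax(z): dH/dz_j = -p_j * (log p_j + H).
    # The chain rule through z = W x + b gives dH/dx = W^T (dH/dz).
    p = softmax(logits(W, b, x))
    h = entropy(p)
    dz = [-pj * (math.log(pj) + h) if pj > 0.0 else 0.0 for pj in p]
    return [sum(dz[j] * W[j][i] for j in range(len(W))) for i in range(len(x))]

def max_entropy_baseline(W, b, x0, lr=0.5, steps=500):
    # Assumed optimizer: fixed-step gradient ascent on the entropy,
    # returning the highest-entropy point visited and its entropy.
    x = list(x0)
    best_x, best_h = list(x), entropy_of_input(W, b, x)
    for _ in range(steps):
        g = entropy_grad(W, b, x)
        x = [xi + lr * gi for xi, gi in zip(x, g)]
        h = entropy_of_input(W, b, x)
        if h > best_h:
            best_x, best_h = list(x), h
    return best_x, best_h

def information_proxy(h):
    # I ~ 1 / H: higher output entropy means less information.
    return 1.0 / h

# Illustrative 3-class model on 4 input features (arbitrary numbers).
W = [[1.0, -0.5, 0.2, 0.0],
     [0.3, 0.8, -1.0, 0.5],
     [-0.7, 0.1, 0.4, 1.2]]
b = [0.5, -0.2, 0.1]
x0 = [1.0, 2.0, -1.0, 0.5]

h0 = entropy_of_input(W, b, x0)
baseline, h_base = max_entropy_baseline(W, b, x0)
print("entropy at start:", h0, "entropy at baseline:", h_base)
print("information proxy at baseline:", information_proxy(h_base))
```

The entropy of a distribution over \( n \) classes is bounded above by \( \log n \), so the closer `h_base` gets to that bound, the closer the baseline is to producing a uniform prediction. For a real network, the analytic gradient would typically be replaced by automatic differentiation.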