Abstract: Currently, attention mechanisms have garnered increasing attention in Graph Neural Networks (GNNs), such as Graph Attention Networks (GATs) and Graph Transformers (GTs). This is due not only to the commendable boost in performance they offer but also to their capacity to provide a more lucid rationale for model behaviors, which are often viewed as inscrutable. However, Attention-based GNNs have demonstrated instability in interpretability when subjected to various sources of perturbations during both training and testing phases, including factors like additional edges or nodes. In this paper, we propose a solution to this problem by introducing a novel notion called Faithful Graph Attention-based Interpretation (FGAI). In particular, FGAI has four crucial properties regarding stability and sensitivity to interpretation and final output distribution. Built upon this notion, we propose an efficient methodology for obtaining FGAI, which can be viewed as an ad hoc modification to the canonical Attention-based GNNs. To validate our proposed solution, we introduce two novel metrics tailored for graph interpretation assessment. Experimental results demonstrate that FGAI exhibits superior stability and preserves the interpretability of attention under various forms of perturbations and randomness, which makes FGAI a more faithful and reliable explanation tool.
What problem does this paper attempt to address?
This paper addresses the instability of attention-based explanations in graph neural networks (GNNs) under perturbation. Attention-based GNNs such as GAT and GT improve model performance and offer clearer explanations, but their interpretability becomes unstable when perturbations (such as added edges or nodes) occur during training or testing. This instability makes the models hard to use as reliable explanation tools.
To solve this problem, the authors propose a new concept named "Faithful Graph Attention-based Interpretation (FGAI)". FGAI has the following four key properties:
1. **Explanation Similarity**: The top \(k\) importance weights of FGAI are highly consistent with the original attention mechanism.
2. **Explanation Stability**: FGAI is robust to randomness and perturbations during the training and testing processes.
3. **Prediction Proximity**: The output distribution of FGAI is very close to that of the original attention mechanism to maintain its excellent prediction performance.
4. **Prediction Stability**: The output distribution of FGAI is robust to randomness and perturbations during the training and testing processes.
Based on these four properties, the authors propose an efficient method for obtaining FGAI and design two new evaluation metrics to assess it. The experimental results show that FGAI is more stable and preserves the interpretability of attention under various forms of perturbations and randomness, making it a more reliable and faithful explanation tool.
### Formula Summary
1. **Attention Coefficient Calculation**:
\[
e_{ij}=a(Wh'_i, Wh'_j)
\]
where \(W\in\mathbb{R}^{F'\times F'}\) is the weight matrix of the shared linear transformation, \(a\) is the attention function, and \(e_{ij}\) represents the importance of the features of node \(j\) to node \(i\).
2. **Normalized Attention Coefficient**:
\[
w_{ij}=\text{softmax}_j(e_{ij})=\frac{\exp(e_{ij})}{\sum_{k\in N_i}\exp(e_{ik})}
\]
where \(N_i\) is the set of neighbors of node \(i\).
3. **Final Output Distribution**:
\[
y_i = \sigma\left(\sum_{j\in N_i}w_{ij}Wh_j\right)
\]
where \(\sigma\) is a nonlinear activation function.
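The three formulas above can be traced for a single node with a minimal pure-Python sketch. It is only an illustration under stated assumptions: the attention function \(a\) is taken to be a LeakyReLU applied to a dot product between a vector `a` and the concatenation of the two transformed feature vectors (the choice made in the original GAT), \(\sigma\) is taken to be `tanh`, and the helper names `matvec`, `attention_weights`, and `node_output` are hypothetical.

```python
import math

def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_weights(W, a, h, i, neighbors):
    """Return {j: w_ij} for node i over its neighbor set N_i."""
    zi = matvec(W, h[i])
    e = {}
    for j in neighbors:
        zj = matvec(W, h[j])
        # Assumed form of a(., .): LeakyReLU(a . [W h_i || W h_j]).
        e[j] = leaky_relu(sum(ak * zk for ak, zk in zip(a, zi + zj)))
    # Softmax over j in N_i, shifted by the maximum for numerical stability.
    m = max(e.values())
    exp_e = {j: math.exp(v - m) for j, v in e.items()}
    total = sum(exp_e.values())
    return {j: v / total for j, v in exp_e.items()}

def node_output(W, a, h, i, neighbors, sigma=math.tanh):
    """Return y_i = sigma(sum_j w_ij * W h_j); sigma is assumed to be tanh."""
    w = attention_weights(W, a, h, i, neighbors)
    agg = [0.0] * len(W)
    for j, wij in w.items():
        zj = matvec(W, h[j])
        for k in range(len(agg)):
            agg[k] += wij * zj[k]
    return [sigma(v) for v in agg]

# Toy graph: three nodes with 2-dimensional features; node 0 attends to all.
h = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 1.0, 0.25]
weights = attention_weights(W, a, h, 0, [0, 1, 2])
y0 = node_output(W, a, h, 0, [0, 1, 2])
```

The weights returned for a node always sum to 1 over its neighbor set. These weights are the quantity that FGAI treats as the explanation.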
4. **Top-k Overlap**:
\[
T_k(x)=\left\{i:i\in [d]\text{ and }\left|\left\{x_j\geq x_i:j\in [d]\right\}\right|\leq k\right\}
\]
\[
V_k(x,x')=\frac{|T_k(x)\cap T_k(x')|}{k}
\]
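As a small illustration of the two definitions above, the following sketch computes \(T_k\) and \(V_k\) directly from the set-builder form. The function names `top_k` and `top_k_overlap` are hypothetical, and indices are 0-based here, whereas \([d]\) in the formula is presumably \(\{1,\dots,d\}\).

```python
def top_k(x, k):
    """T_k(x): indices i such that at most k entries x_j satisfy x_j >= x_i."""
    return {i for i, xi in enumerate(x)
            if sum(1 for xj in x if xj >= xi) <= k}

def top_k_overlap(x, x_prime, k):
    """V_k(x, x'): fraction of the top-k indices shared by x and x'."""
    return len(top_k(x, k) & top_k(x_prime, k)) / k

# Two attention vectors over the same four neighbors.
x = [0.5, 0.2, 0.2, 0.1]
x_prime = [0.4, 0.1, 0.3, 0.2]

set_x = top_k(x, 2)                        # {0}: the tie at 0.2 excludes both tied entries
set_x_prime = top_k(x_prime, 2)            # {0, 2}
overlap = top_k_overlap(x, x_prime, 2)     # 0.5
```

Note that, read literally, the definition can return fewer than \(k\) indices when values are tied, as the first vector shows. An overlap of 1 means the two vectors agree on their \(k\) most important entries.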
5. **Define FGAI**:
- **Explanation Similarity**: \(V_{k_1}(\tilde{w}(h_i), w(h_i))\geq\beta_1\)
- **Explanation Stability**: \(V_{k_2}(\tilde{w}(h_i),\tilde{w}(h_i)+\rho)\geq\beta_2\), for all \(\|\rho\|\leq R_1\)
- **Prediction Proximity**: \(D(y(h_i,\tilde{w}),y(h_i,w))\leq\alpha_1\)
- **Prediction Stability**: \(D(y(h_i,\tilde{w}),y