Abstract: Graph Neural Networks (GNNs) have recently been widely adopted in multiple domains. Yet, they are notably vulnerable to adversarial and backdoor attacks. In particular, backdoor attacks based on subgraph insertion have been shown to be effective in graph classification tasks while being stealthy, successfully circumventing various existing defense methods. In this paper, we propose E-SAGE, a novel approach to defending GNN backdoor attacks based on explainability. We find that the malicious edges and benign edges have significant differences in the importance scores for explainability evaluation. Accordingly, E-SAGE adaptively applies an iterative edge pruning process on the graph based on the edge scores. Through extensive experiments, we demonstrate the effectiveness of E-SAGE against state-of-the-art graph backdoor attacks in different attack settings. In addition, we investigate the effectiveness of E-SAGE against adversarial attacks.

What problem does this paper attempt to address?

This paper addresses the vulnerability of Graph Neural Networks (GNNs) to backdoor attacks. Specifically, the authors focus on backdoor attacks based on subgraph insertion, which have been shown to be effective and highly covert in graph classification tasks and can bypass existing defense methods.

### Main problems

1. **Covertness and effectiveness of backdoor attacks**: A backdoor attack embeds a trigger in the training data, so that at prediction time the model misclassifies the target node into the attacker-specified category whenever the sample contains the trigger. Such attacks are particularly covert because the backdoored model performs on normal data no differently from a clean model.
2. **Limitations of existing defense methods**: Existing interpretability-based defenses have limitations, such as the need for a clean validation dataset and the significant computational cost of calculating interpretability metrics.

### Solutions

To solve the above problems, the authors propose E-SAGE, a new defense method based on interpretability. The main contributions of E-SAGE are as follows:

1. **Adaptive edge-pruning algorithm**: E-SAGE adaptively prunes the adversarial subgraph at prediction time by quantifying the importance score of each edge. This takes advantage of the significant difference between the scores of malicious and benign edges in interpretability evaluation.
2. **Expanding the defense scope**: E-SAGE applies not only to backdoor attacks but is also extended to handle adversarial attacks involving multiple subgraph insertions. To improve computational efficiency, E-SAGE introduces a neighbor-sampling strategy similar to GraphSAGE.

### Method principle

The core idea of E-SAGE is to use integrated gradients to explain changes in model predictions. The specific steps are:

- **Integrated gradient calculation**: Quantify the importance of each edge to the prediction through the formula \( \text{IntegratedGrad}_i(x)=(x_i - x'_i)\times\int_0^1\frac{\partial F(x'+\alpha\times(x - x'))}{\partial x_i}d\alpha \), where \( F \) is the model, \( x \) is the input, and \( x' \) is a baseline input.
- **Threshold determination and pruning**: Set a threshold \( \beta \) according to the importance scores and gradually prune the edges whose scores exceed it. For low-degree nodes, a sigmoid function controls the threshold to balance accuracy and defense performance.
- **Iterative pruning**: For complex attacks with multiple subgraph insertions, E-SAGE removes malicious edges over several pruning rounds to ensure the effectiveness of the defense.

### Experimental results

In experiments on multiple datasets and GNN models, E-SAGE significantly reduces the attack success rate while maintaining high accuracy. In particular, for the UGBA attack, which other defense methods cannot handle, E-SAGE can accurately identify and remove the embedded trigger.

### Summary

E-SAGE provides an effective and efficient defense mechanism that can resist state-of-the-art backdoor attacks on graph neural networks and has wide applicability. Future work will study how triggers could be made less conspicuous to interpretability methods, which would increase the covertness of the attack.
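The integrated-gradient scoring and iterative threshold pruning described under "Method principle" can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the model `F` is replaced by a hypothetical one-node toy scorer (a sigmoid over the sum of per-edge signals), the baseline `x'` is assumed to be the all-zero edge mask, the integral is approximated with a midpoint Riemann sum, and `beta` is a fixed constant. The degree-dependent sigmoid threshold and the neighbor sampling are omitted because the summary does not specify them.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(mask, signal, bias):
    """Toy stand-in for F: target-class probability for one node.

    mask[i] is the weight of edge i (1 = present, 0 = pruned) and
    signal[i] is the hypothetical contribution that edge carries.
    """
    return sigmoid(sum(m * s for m, s in zip(mask, signal)) + bias)


def model_grad(mask, signal, bias):
    """Analytic gradient of the toy model with respect to the edge mask."""
    p = model(mask, signal, bias)
    return [p * (1.0 - p) * s for s in signal]


def integrated_gradients(mask, signal, bias, steps=200):
    """Midpoint-rule approximation of integrated gradients per edge."""
    baseline = [0.0] * len(mask)  # assumed baseline x': no edges
    totals = [0.0] * len(mask)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (m - b) for m, b in zip(mask, baseline)]
        for i, g in enumerate(model_grad(point, signal, bias)):
            totals[i] += g
    return [(m - b) * t / steps for m, b, t in zip(mask, baseline, totals)]


def iterative_prune(signal, bias, beta, max_rounds=5):
    """Repeatedly drop edges whose importance score exceeds beta."""
    mask = [1.0] * len(signal)
    for _ in range(max_rounds):
        scores = integrated_gradients(mask, signal, bias)
        drop = [i for i, sc in enumerate(scores) if mask[i] > 0 and sc > beta]
        if not drop:
            break  # no remaining edge looks suspicious
        for i in drop:
            mask[i] = 0.0
    return mask


if __name__ == "__main__":
    # Three benign edges followed by two high-influence "trigger" edges.
    signal = [0.2, -0.1, 0.3, 4.0, 3.5]
    print(iterative_prune(signal, bias=-1.0, beta=0.2))
    # prints [1.0, 1.0, 1.0, 0.0, 0.0]
```

In this toy setting the two trigger edges dominate the attribution in the first round and are removed, and the second round finds no score above `beta`, so pruning stops. Re-scoring after each round is what lets a multi-round scheme catch edges that only stand out once stronger ones are gone.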

E-SAGE: Explainability-based Defense Against Backdoor Attacks on Graph Neural Networks
