Krait: A Backdoor Attack Against Graph Prompt Tuning

Ying Song, Rita Singh, Balaji Palanisamy
2024-07-18
Abstract: Graph prompt tuning has emerged as a promising paradigm to effectively transfer general graph knowledge from pre-trained models to various downstream tasks, particularly in few-shot contexts. However, its susceptibility to backdoor attacks, where adversaries insert triggers to manipulate outcomes, raises a critical concern. We conduct the first study to investigate such vulnerability, revealing that backdoors can disguise benign graph prompts, thus evading detection. We introduce Krait, a novel graph prompt backdoor. Specifically, we propose a simple yet effective model-agnostic metric called label non-uniformity homophily to select poisoned candidates, significantly reducing computational complexity. To accommodate diverse attack scenarios and advanced attack types, we design three customizable trigger generation methods to craft prompts as triggers. We propose a novel centroid similarity-based loss function to optimize prompt tuning for attack effectiveness and stealthiness. Experiments on four real-world graphs demonstrate that Krait can efficiently embed triggers to merely 0.15% to 2% of training nodes, achieving high attack success rates without sacrificing clean accuracy. Notably, in one-to-one and all-to-one attacks, Krait can achieve 100% attack success rates by poisoning as few as 2 and 22 nodes, respectively. Our experiments further show that Krait remains potent across different transfer cases, attack types, and graph neural network backbones. Additionally, Krait can be successfully extended to the black-box setting, posing more severe threats. Finally, we analyze why Krait can evade both classical and state-of-the-art defenses, and provide practical insights for detecting and mitigating this class of attacks.
Machine Learning, Cryptography and Security
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper explores the vulnerability of graph prompt tuning to backdoor attacks and proposes a new backdoor attack method called Krait.

#### Main Research Questions:

1. **Is graph prompt tuning susceptible to backdoor attacks?**
   - The paper finds that graph prompt tuning does face this security threat.
2. **Can attackers use graph prompts as triggers to exploit this vulnerability?**
   - The paper proposes a method that lets attackers craft graph prompts as triggers to carry out backdoor attacks.
3. **How can an effective and stealthy graph prompt backdoor attack be designed?**
   - The paper proposes Krait, a new backdoor attack targeting graph prompt tuning, with three key components:
     - A label non-uniformity homophily metric, used to identify the most vulnerable nodes as poisoning candidates;
     - Three trigger generation methods that adapt to various attack scenarios and advanced attack types;
     - A centroid similarity-based loss function that improves attack effectiveness and stealthiness.

#### Main Contributions:

- Designs two node-level homophily metrics and three centroid similarity-based metrics to analyze behavioral changes before and after graph prompt tuning.
- Validates the effectiveness and stealthiness of Krait through extensive experiments, showing that it remains potent across different transfer cases, attack types, and graph neural network (GNN) backbones.
- Extends Krait to the black-box setting, making the attack more practical.
- Shows that existing defense mechanisms cannot effectively detect or mitigate Krait, underscoring the need for protective measures for graph prompt tuning models.

### Summary

This paper presents the first systematic study of the vulnerability of graph prompt tuning to backdoor attacks and proposes a new attack method, Krait, which achieves high attack success rates while remaining stealthy. It provides a useful reference for designing future defenses.
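
To make the homophily and centroid-similarity ideas mentioned above more concrete, here is a minimal Python sketch. It does not reproduce the paper's label non-uniformity homophily metric or its loss function, neither of which is defined in this summary. Instead it assumes the common node-level label homophily (the fraction of a node's neighbors that share its label) and a plain cosine similarity between a node embedding and its class centroid. All function names, the toy graph, and the embeddings are hypothetical.

```python
from collections import defaultdict
import math


def node_label_homophily(edges, labels):
    """Fraction of each node's neighbors sharing its label (undirected graph).

    Assumption: this is the standard node-level homophily, not the paper's
    label non-uniformity homophily.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    scores = {}
    for node, label in labels.items():
        nbrs = neighbors[node]
        if not nbrs:
            scores[node] = 0.0  # assumption: isolated nodes score 0
            continue
        same = sum(1 for n in nbrs if labels[n] == label)
        scores[node] = same / len(nbrs)
    return scores


def class_centroids(embeddings, labels):
    """Mean embedding vector for each class label."""
    sums = {}
    counts = defaultdict(int)
    for node, vec in embeddings.items():
        c = labels[node]
        if c not in sums:
            sums[c] = [0.0] * len(vec)
        sums[c] = [s + x for s, x in zip(sums[c], vec)]
        counts[c] += 1
    return {c: [s / counts[c] for s in total] for c, total in sums.items()}


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical toy graph: four nodes, two classes, 2-d embeddings.
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
embeddings = {0: [1.0, 0.0], 1: [0.8, 0.2], 2: [0.1, 0.9], 3: [0.0, 1.0]}

homophily = node_label_homophily(edges, labels)
centroids = class_centroids(embeddings, labels)

# Rank nodes from least to most homophilous, as one possible (assumed)
# way to shortlist candidates for closer inspection.
ranked = sorted(homophily, key=homophily.get)
```

In this toy graph, node 2 has the lowest homophily because two of its three neighbors carry a different label. Comparing `cosine_similarity(embeddings[n], centroids[c])` across classes `c` before and after tuning is one simple way to observe whether a node's embedding drifts toward another class centroid.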