Krait: A Backdoor Attack Against Graph Prompt Tuning

Ying Song, Rita Singh, Balaji Palanisamy

2024-07-18

Abstract: Graph prompt tuning has emerged as a promising paradigm to effectively transfer general graph knowledge from pre-trained models to various downstream tasks, particularly in few-shot contexts. However, its susceptibility to backdoor attacks, where adversaries insert triggers to manipulate outcomes, raises a critical concern. We conduct the first study to investigate such vulnerability, revealing that backdoors can disguise benign graph prompts, thus evading detection. We introduce Krait, a novel graph prompt backdoor. Specifically, we propose a simple yet effective model-agnostic metric called label non-uniformity homophily to select poisoned candidates, significantly reducing computational complexity. To accommodate diverse attack scenarios and advanced attack types, we design three customizable trigger generation methods to craft prompts as triggers. We propose a novel centroid similarity-based loss function to optimize prompt tuning for attack effectiveness and stealthiness. Experiments on four real-world graphs demonstrate that Krait can efficiently embed triggers to merely 0.15% to 2% of training nodes, achieving high attack success rates without sacrificing clean accuracy. Notably, in one-to-one and all-to-one attacks, Krait can achieve 100% attack success rates by poisoning as few as 2 and 22 nodes, respectively. Our experiments further show that Krait remains potent across different transfer cases, attack types, and graph neural network backbones. Additionally, Krait can be successfully extended to the black-box setting, posing more severe threats. Finally, we analyze why Krait can evade both classical and state-of-the-art defenses, and provide practical insights for detecting and mitigating this class of attacks.
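The abstract reports two headline quantities, attack success rate and clean accuracy. As a rough illustration of how such metrics are commonly computed in the backdoor literature, here is a minimal sketch. The function names are hypothetical, and the choice to exclude nodes that already belong to the target class from the attack success rate is an assumption about the usual convention, not a detail confirmed by this paper.

```python
from typing import Sequence


def clean_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of clean (trigger-free) test nodes classified correctly."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and equal in length")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def attack_success_rate(
    triggered_preds: Sequence[int],
    true_labels: Sequence[int],
    target_label: int,
) -> float:
    """Fraction of triggered nodes that the model assigns to the target label.

    Assumption: nodes whose true label already equals the target label are
    excluded, since predicting the target for them is not evidence of a
    successful attack.
    """
    victims = [
        p for p, y in zip(triggered_preds, true_labels) if y != target_label
    ]
    if not victims:
        raise ValueError("no triggered nodes outside the target class")
    return sum(p == target_label for p in victims) / len(victims)
```

For a one-to-one attack, the same function could be applied after filtering the inputs down to nodes of the single source class; that filtering step is likewise an assumption about the evaluation setup.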

Machine Learning, Cryptography and Security

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper explores the vulnerability of graph prompt tuning to backdoor attacks and proposes a new backdoor attack method called Krait.

#### Main Research Questions

1. **Is graph prompt tuning susceptible to backdoor attacks?**
   - The study finds that graph prompt tuning does face this security threat.
2. **Can attackers use graph prompts as triggers to exploit this vulnerability?**
   - The paper proposes a method that lets attackers generate graph prompts as triggers to carry out backdoor attacks.
3. **How can an effective and stealthy graph prompt backdoor attack be designed?**
   - The paper proposes Krait, a new backdoor attack targeting graph prompt tuning, with three key components:
     - a label non-uniformity homophily metric, used to identify the most vulnerable nodes as poisoning candidates;
     - three trigger generation methods that adapt to various attack scenarios and advanced attack types;
     - a centroid similarity-based loss function that improves attack effectiveness and stealthiness.

#### Main Contributions

- Designed two node-level homophily metrics and three centroid similarity-based metrics to analyze behavioral changes before and after graph prompt tuning.
- Validated the effectiveness and stealthiness of Krait through extensive experiments across different transfer cases, attack types, and graph neural network (GNN) backbones.
- Extended Krait to the black-box setting, making the attack more practical.
- Showed that existing defense mechanisms cannot effectively detect or mitigate Krait, underscoring the need for protective measures for graph prompt tuning models.

### Summary

This paper presents the first systematic study of the vulnerability of graph prompt tuning to backdoor attacks and proposes a new attack method called Krait, which achieves high attack success rates while remaining stealthy. It also offers practical insights to guide future defenses.
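The summary refers to node-level homophily metrics and centroid similarity-based metrics without defining them. To make the two ideas concrete, here is a generic sketch. It is not the paper's label non-uniformity homophily metric or its loss function: `node_label_homophily` is the plain share of same-label neighbors, and `centroid_cosine` is an ordinary cosine similarity between a node embedding and a class centroid. All names and data layouts are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Sequence, Tuple


def node_label_homophily(
    edges: Iterable[Tuple[Hashable, Hashable]],
    labels: Dict[Hashable, Hashable],
) -> Dict[Hashable, float]:
    """For each labeled node, the share of its neighbors with the same label.

    The graph is treated as undirected; isolated nodes get 0.0.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    scores = {}
    for node, label in labels.items():
        nbrs = [m for m in neighbors[node] if m in labels]
        scores[node] = (
            sum(labels[m] == label for m in nbrs) / len(nbrs) if nbrs else 0.0
        )
    return scores


def class_centroids(
    embeddings: Dict[Hashable, Sequence[float]],
    labels: Dict[Hashable, Hashable],
) -> Dict[Hashable, List[float]]:
    """Mean embedding per class, over nodes that have both label and embedding."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = defaultdict(int)
    for node, vec in embeddings.items():
        if node not in labels:
            continue
        cls = labels[node]
        if cls not in sums:
            sums[cls] = [0.0] * len(vec)
        sums[cls] = [s + x for s, x in zip(sums[cls], vec)]
        counts[cls] += 1
    return {cls: [s / counts[cls] for s in total] for cls, total in sums.items()}


def centroid_cosine(vec: Sequence[float], centroid: Sequence[float]) -> float:
    """Cosine similarity between an embedding and a class centroid."""
    dot = sum(a * b for a, b in zip(vec, centroid))
    norm = math.sqrt(sum(a * a for a in vec)) * math.sqrt(sum(b * b for b in centroid))
    return dot / norm if norm > 0 else 0.0
```

In this generic form, nodes with low homophily scores sit near class boundaries, and comparing a node's similarity to each class centroid before and after tuning gives one simple way to track how embeddings shift. How Krait actually ranks poisoning candidates and shapes its loss is specified in the paper itself.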

KerbNet: A QoE-aware Kernel-Based Backdoor Attack Framework

ATTEQ-NN: Attention-based QoE-aware Evasive Backdoor Attacks

Trojan Prompt Attacks on Graph Neural Networks

Cross-Context Backdoor Attacks against Graph Prompt Learning

Rethinking Graph Backdoor Attacks: A Distribution-Preserving Perspective

Defense-as-a-Service: Black-box Shielding against Backdoored Graph Models

Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models

Unnoticeable Backdoor Attacks on Graph Neural Networks

Neighboring Backdoor Attacks on Graph Convolutional Network

A Clean-graph Backdoor Attack against Graph Convolutional Networks with Poisoned Label Only

Motif-Backdoor: Rethinking the Backdoor Attack on Graph Neural Networks via Motifs

Explainability-based Backdoor Attacks Against Graph Neural Networks

PPT: Backdoor Attacks on Pre-trained Models Via Poisoned Prompt Tuning

Rethinking the Trigger-injecting Position in Graph Backdoor Attack

Poison Ink: Robust and Invisible Backdoor Attack

Shortcuts Arising from Contrast: Effective and Covert Clean-Label Attacks in Prompt-Based Learning

Transferable Graph Backdoor Attack

SABER: Model-agnostic Backdoor Attack on Chain-of-Thought in Neural Code Generation

Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers

TARGET: Template-Transferable Backdoor Attack Against Prompt-based NLP Models via GPT4