Message Injection Attack on Rumor Detection under the Black-Box Evasion Setting Using Large Language Model

Dacheng Wen,Yifeng Luo,Yupeng Li,Liang Lan
DOI: https://doi.org/10.1145/3589334.3648139
2024-05-13
Abstract:Recent analyses have disclosed that existing rumor detection techniques, despite playing a pivotal role in countering the dissemination of misinformation on social media, are vulnerable to both white-box and surrogate-based black-box adversarial attacks. However, such attacks depend heavily on unrealistic assumptions, e.g., modifiable user data and white-box access to the rumor detection models, or appropriate selections of surrogate models, which are impractical in the real world. Thus, existing analyses fail to uncover the robustness of rumor detectors in practice. In this work, we take a further step towards the investigation about the robustness of existing rumor detection solutions. Specifically, we focus on the state-of-the-art rumor detectors, which leverage graph neural network based models to predict whether a post is rumor based on the Message Propagation Tree (MPT), a conversation tree with the post as its root and the replies to the post as the descendants of the root. We propose a novel black-box attack method, HMIA-LLM, against these rumor detectors, which uses the Large Language Model to generate malicious messages and inject them into the targeted MPTs. Our extensive evaluation conducted across three rumor detection datasets, four target rumor detectors, and three baselines for comparison demonstrates the effectiveness of our proposed attack method in compromising the performance of the state-of-the-art rumor detectors.
Computer Science
What problem does this paper attempt to address?