FastTextDodger: Decision-Based Adversarial Attack Against Black-Box NLP Models With Extremely High Efficiency

Xiaoxue Hu, Geling Liu, Baolin Zheng, Lingchen Zhao, Qian Wang, Yufei Zhang, Minxin Du
DOI: https://doi.org/10.1109/tifs.2024.3350376
IF: 7.231
2024-02-02
IEEE Transactions on Information Forensics and Security
Abstract: Query-efficient adversarial example attacks against black-box natural language models have recently attracted widespread attention from researchers. The task is difficult because text is discrete, knowledge of the target model is limited, and real-world systems impose strict limits on query access. Existing attacks often require a large number of queries or achieve low attack success rates, and therefore fall short of practical requirements. To address this, we propose FastTextDodger, a simple and compact decision-based black-box textual adversarial attack that generates grammatically correct adversarial texts with high attack success rates and few queries. Experimental results show that FastTextDodger achieves a 97.4% attack success rate on benchmark datasets and models while needing only about 200 queries. Compared with state-of-the-art attacks, FastTextDodger requires only one-tenth of the queries in text classification and entailment tasks while maintaining comparable attack success rates and perturbed word rates.
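The abstract reports three quantities: attack success rate, number of queries, and perturbed word rate. The following is a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code. The `AttackRecord` type, the function names, and the toy records are hypothetical, and the sketch assumes word-level substitutions on whitespace-tokenized text, with queries and perturbed word rate averaged over successful attacks only (conventions vary between papers).

```python
from dataclasses import dataclass


@dataclass
class AttackRecord:
    """Outcome of attacking one input (hypothetical record type)."""
    original: str      # clean input text
    adversarial: str   # final candidate text produced by the attack
    succeeded: bool    # True if the model's predicted label changed
    queries: int       # number of model queries spent on this input


def perturbed_word_rate(original: str, adversarial: str) -> float:
    """Fraction of original words that differ, assuming position-aligned
    word substitutions; any length difference counts as extra changes."""
    orig_words = original.split()
    adv_words = adversarial.split()
    if not orig_words:
        return 0.0
    changed = sum(o != a for o, a in zip(orig_words, adv_words))
    changed += abs(len(orig_words) - len(adv_words))
    return changed / len(orig_words)


def summarize(records: list[AttackRecord]) -> dict[str, float]:
    """Aggregate metrics; queries and perturbed word rate are averaged
    over successful attacks only (an assumption of this sketch)."""
    successes = [r for r in records if r.succeeded]
    n_total = len(records)
    n_succ = len(successes)
    return {
        "attack_success_rate": n_succ / n_total if n_total else 0.0,
        "avg_queries": (
            sum(r.queries for r in successes) / n_succ if n_succ else 0.0
        ),
        "perturbed_word_rate": (
            sum(perturbed_word_rate(r.original, r.adversarial)
                for r in successes) / n_succ
            if n_succ else 0.0
        ),
    }


# Made-up toy records for illustration only; not results from the paper.
records = [
    AttackRecord("the movie was great", "the film was great", True, 150),
    AttackRecord("a dull plot", "a dull plot", False, 400),
    AttackRecord("really good acting overall", "truly fine acting overall", True, 250),
]
summary = summarize(records)
```

Under these conventions, a lower perturbed word rate at the same success rate indicates that the adversarial texts stay closer to the originals, and a lower query count indicates a cheaper attack against a rate-limited system.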
Computer Science, Theory & Methods; Engineering, Electrical & Electronic
What problem does this paper attempt to address?