Abstract: Neural network models are widely applied to social media tasks such as hate speech detection and sentiment analysis, but they are susceptible to adversarial attacks. In a text classification task, for instance, an attacker carefully introduces perturbations that hardly alter the semantics of the original text in order to trick the model into making a different prediction. Studying textual adversarial attack methods makes it possible to evaluate, and then improve, the robustness of language models. Most research in this field currently focuses on English, with a smaller body of work on Chinese; little research targets Chinese minority languages. With the rapid development of artificial intelligence technology and the emergence of Chinese minority language models, textual adversarial attacks have become a new challenge for the information processing of Chinese minority languages. In response, we propose TSTricker, a multi-granularity Tibetan textual adversarial attack method based on masked language models. We use masked language models to generate candidate substitution syllables or words, adopt a scoring mechanism to determine the substitution order, and then apply the attack to several fine-tuned victim models. The experimental results show that TSTricker reduces the accuracy of the classification models by more than 28.70% and changes the models' predictions on more than 90.60% of the samples, a clearly stronger attack effect than the baseline method.
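
The pipeline summarized in the abstract (generate candidate substitutions, score units to fix a substitution order, then attack a victim model) can be illustrated with a minimal Python sketch. This is not the TSTricker implementation. Everything named here is a hypothetical stand-in: `toy_victim` replaces a fine-tuned classifier, `toy_candidates` replaces the top-k predictions of a masked language model, and whitespace tokens replace Tibetan syllables or words. The scoring rule (the drop in the original label's probability when a unit is masked) is an assumption; the abstract does not specify the scoring mechanism.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Probs = Dict[str, float]


def toy_victim(tokens: Sequence[str]) -> Probs:
    """Hypothetical stand-in for a fine-tuned classifier: counts cue words."""
    pos = sum(t in {"good", "great"} for t in tokens)
    neg = sum(t in {"bad", "awful"} for t in tokens)
    p_pos = (1 + pos) / (2 + pos + neg)
    return {"pos": p_pos, "neg": 1 - p_pos}


def toy_candidates(tokens: Sequence[str], i: int) -> List[str]:
    """Hypothetical stand-in for masked-LM top-k predictions at position i."""
    table = {"good": ["fine", "okay"], "great": ["decent", "notable"]}
    return table.get(tokens[i], [])


def predict(probs: Probs) -> str:
    """Return the label with the highest probability."""
    return max(probs, key=probs.get)


def attack(
    tokens: Sequence[str],
    victim: Callable[[Sequence[str]], Probs],
    candidates: Callable[[Sequence[str], int], List[str]],
    mask: str = "[MASK]",
) -> Tuple[List[str], bool]:
    """Greedy substitution attack; returns (adversarial tokens, success flag)."""
    tokens = list(tokens)
    label = predict(victim(tokens))
    base = victim(tokens)[label]

    # Assumed scoring rule: importance of position i is the drop in the
    # original label's probability when that position is masked.
    scored = []
    for i in range(len(tokens)):
        masked = tokens[:i] + [mask] + tokens[i + 1:]
        scored.append((base - victim(masked)[label], i))
    scored.sort(key=lambda s: (-s[0], s[1]))  # most important first

    for _, i in scored:
        best_tok, best_p = tokens[i], victim(tokens)[label]
        for cand in candidates(tokens, i):
            trial = tokens[:i] + [cand] + tokens[i + 1:]
            p = victim(trial)[label]
            if p < best_p:  # keep the candidate that hurts the label most
                best_tok, best_p = cand, p
        tokens[i] = best_tok
        if predict(victim(tokens)) != label:
            return tokens, True  # prediction flipped: attack succeeded
    return tokens, False


if __name__ == "__main__":
    text = "the food was good and great but bad".split()
    adversarial, success = attack(text, toy_victim, toy_candidates)
    print(" ".join(adversarial), success)
```

In a real setting, the two toy functions would be replaced by calls to the victim classifier and to a masked language model, and a semantic-similarity constraint would normally filter the candidates; both are outside the scope of this sketch.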

Multi-Granularity Tibetan Textual Adversarial Attack Method Based on Masked Language Model
