An adversarial-example generation method for Chinese sentiment tendency classification based on audiovisual confusion and contextual association
Hongxu Ou, Long Yu, Shengwei Tian, Xin Chen, Chen Shi, Bo Wang, Tiejun Zhou
DOI: https://doi.org/10.1007/s10115-023-01946-y
IF: 2.7
2023-08-10
Knowledge and Information Systems
Abstract: Methods for generating adversarial examples have been explored mainly on English data, while research on Chinese adversarial examples remains very limited. Moreover, existing Chinese adversarial attack methods tend to rely on a single form of generation, produce examples with little variety of expression, and leave room for improvement in attack effectiveness. This paper therefore proposes SentiAttack, a method that introduces six perturbations from two perspectives, based on the characteristics of Chinese. The six perturbation types come from audiovisual deception (words with similar sound, Chinese characters with similar form, horizontal splitting of Chinese characters, and reversing the order of adjacent Chinese characters within a word) and contextualized generation (WoBERT-MLM (Su in Wobert: Word-based chinese bert model - zhuiyiai. Technical report, 2020. https://github.com/ZhuiyiTechnology/WoBERT) word generation and LongLM (Guan et al. in Trans Assoc Comput Linguist 10:434–451, 2022. https://doi.org/10.1162/tacl_a_00469) sentence-piece generation). In addition, a "fluency" metric is added to further measure the quality of the adversarial examples. We conducted experiments on five datasets (CH-SIMS 3, ChnSentiCorp, online shopping, waimai, and weibo8). Under effective constraints on semantic similarity, expression fluency, and perturbation, we obtained accuracy decreases of 74.40%, 49.10%, 42.90%, 39.90% and 66.20%, respectively.
computer science, information systems, artificial intelligence
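
The four audiovisual perturbations named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names and the three tiny lookup tables are hypothetical and stand in for the full homophone, similar-glyph, and component dictionaries a real attack would need. The two contextual perturbations (WoBERT-MLM and LongLM generation) are not covered here.

```python
# Hypothetical toy tables for illustration only.
HOMOPHONES = {"喜": "洗"}            # similar sound (both read "xi")
SIMILAR_FORM = {"己": "已"}          # similar glyph shape
SPLITS = {"好": "女子", "明": "日月"}  # left-right components of a character


def substitute(word: str, table: dict) -> str:
    """Replace the first character of `word` that has an entry in `table`."""
    for idx, ch in enumerate(word):
        if ch in table:
            return word[:idx] + table[ch] + word[idx + 1:]
    return word  # no substitutable character: leave the word unchanged


def split_horizontally(word: str, table: dict = SPLITS) -> str:
    """Replace each splittable character with its left and right components."""
    return "".join(table.get(ch, ch) for ch in word)


def swap_adjacent(word: str, i: int = 0) -> str:
    """Swap the characters at positions i and i + 1 within a word."""
    if not 0 <= i < len(word) - 1:
        return word  # nothing to swap at this position
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


if __name__ == "__main__":
    print(substitute("喜欢", HOMOPHONES))    # similar-sound substitution
    print(substitute("自己", SIMILAR_FORM))  # similar-form substitution
    print(split_horizontally("好评"))        # horizontal splitting
    print(swap_adjacent("喜欢"))             # adjacent-character reversal
```

Each function perturbs a single word. Choosing which words to perturb, and filtering candidates by semantic similarity and fluency, are separate steps that this sketch leaves out.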