Generating Natural Language Adversarial Examples Based on the Approximating Top-K Combination Token Substitution.

Panfeng Qiu, Xi Wu, Yongxin Zhao
DOI: https://doi.org/10.1109/hpcc-dss-smartcity-dependsys57074.2022.00254
2022-01-01
Abstract: Deep Neural Networks (DNNs) are widely used in Natural Language Processing (NLP) applications. However, recent studies have shown that DNN-based NLP models, which lack interpretability, are vulnerable to adversarial attacks that add subtle perturbations to their inputs. For existing adversarial attack methods, it remains challenging to keep the generated adversarial text highly similar to the original text while ensuring both grammatical correctness and semantic preservation. In this paper, we propose a novel attack method based on approximating Top-K combination token substitution to generate adversarial text. We extend the sequential substitution commonly used in existing methods into a combination substitution, and combine it with Monte Carlo simulation to significantly expand the search space. Furthermore, using part-of-speech information, we combine a synonym-based token substitution strategy with a language-model-based substitution strategy to generate adversarial texts that are semantically consistent with the original texts. Extensive experiments show that our method outperforms previous methods in attack efficiency, perturbation rate, and semantic similarity. Moreover, training on adversarial samples generated by our approach effectively improves the robustness of the model.
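The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of combination substitution with Monte Carlo sampling, not the authors' implementation. It assumes that candidate substitutes per token position are already available (for example from a synonym list or a language model) and that a scoring function returns higher values for more adversarial texts. The function name `sample_top_k_substitutions`, its parameters, and the toy scoring function in the usage note are all hypothetical.

```python
import heapq
import random
from typing import Callable, Dict, List, Sequence, Tuple


def sample_top_k_substitutions(
    tokens: Sequence[str],
    candidates: Dict[int, List[str]],        # position -> substitute tokens (assumed given)
    score: Callable[[List[str]], float],     # higher = more adversarial (assumed given)
    k: int = 3,
    max_positions: int = 2,
    n_samples: int = 200,
    seed: int = 0,
) -> List[Tuple[float, List[str]]]:
    """Approximate the k best multi-position substitutions by random sampling."""
    rng = random.Random(seed)
    positions = sorted(candidates)
    if not positions:
        return []

    seen = set()
    scored: List[Tuple[float, List[str]]] = []
    for _ in range(n_samples):
        # Sample a combination of positions instead of substituting one at a time.
        size = rng.randint(1, min(max_positions, len(positions)))
        chosen = rng.sample(positions, size)

        perturbed = list(tokens)
        for pos in chosen:
            perturbed[pos] = rng.choice(candidates[pos])

        key = tuple(perturbed)
        if key in seen:  # skip combinations that were already scored
            continue
        seen.add(key)
        scored.append((score(perturbed), perturbed))

    # Keep only the k highest-scoring sampled combinations.
    return heapq.nlargest(k, scored, key=lambda item: item[0])
```

As a usage illustration, calling the function with `tokens = ["the", "movie", "was", "good"]`, `candidates = {1: ["film", "picture"], 3: ["fine", "decent"]}`, and a toy `score` that counts changed tokens returns up to `k` perturbed token lists, best first. In a real attack the score would come from the victim model, and the candidates would be filtered by part of speech and semantic similarity.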