A Lightweight Chinese Multimodal Textual Defense Method Based on Contrastive-Adversarial Training

Xiangge Li, Hong Luo, Yan Sun
DOI: https://doi.org/10.1109/ijcnn60899.2024.10649990
2024-01-01
Abstract: Chinese text classification models are vulnerable to adversarial attacks. By exploiting Chinese language features such as phonology and glyphs, attackers can modify several words or characters and change the output of a classification model without affecting the semantics of the sentence. Adversarial examples are becoming a serious challenge to the robustness of classification models, even state-of-the-art ones such as pre-trained language models (PLMs). It is therefore necessary to improve the robustness of text classification models using Chinese language features. In this paper, we propose CWordDefender, an efficient defense method that addresses adversarial robustness in Chinese text classification tasks. We extract multimodal information based on Chinese features and fine-tune PLMs with contrastive-adversarial learning. Experimental results show that CWordDefender outperforms the baseline model by at least 5% in accuracy and has a lower inference time.
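The abstract does not spell out the contrastive-adversarial objective, so the following is only a generic sketch of one common choice, an InfoNCE-style loss that pulls the embedding of each clean sentence toward the embedding of its adversarial counterpart and away from the other sentences in the batch. The function names `cosine` and `info_nce`, the temperature value, and the toy two-dimensional embeddings are all assumptions made for illustration; this is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(clean, adversarial, temperature=0.1):
    """Mean InfoNCE loss over a batch (hypothetical helper).

    clean[i] and adversarial[i] are embeddings of the same sentence
    (positive pair); adversarial[j] for j != i act as negatives.
    """
    total = 0.0
    for i, anchor in enumerate(clean):
        logits = [cosine(anchor, adv) / temperature for adv in adversarial]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Cross-entropy with the matching index i as the target.
        total += log_denominator - logits[i]
    return total / len(clean)


# Toy embeddings (assumed values, for illustration only).
clean = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # adversarial views close to their clean source
swapped = [[0.1, 0.9], [0.9, 0.1]]   # adversarial views close to the wrong source

loss_aligned = info_nce(clean, aligned)
loss_swapped = info_nce(clean, swapped)
```

In this toy setup the loss is small when each adversarial embedding stays close to its clean source and large when it drifts toward another sentence, which is the behavior a robustness-oriented contrastive term is meant to penalize.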