Defending Machine Reading Comprehension against Question-Targeted Attacks.

Xuanjie Fang, Wei Wang
DOI: https://doi.org/10.1109/IJCNN54540.2023.10191697
2023-01-01
Abstract: Machine reading comprehension (MRC) is used in open-domain question answering (QA) to extract precise answers from retrieved passages. Pre-trained language models (PLMs) have achieved human-level accuracy on various MRC benchmarks thanks to the success of deep learning. However, work on adversarial attacks has shown that these models are easily misled by small perturbations. In this paper, a model-agnostic adversarial framework, named DQAAT, is proposed to defend against question-targeted attacks. Because such attacks range from the sentence level to the word level, we generate multi-level adversarial samples for adversarial training, and apply a reinforcement-learning-based locator that finds vulnerable cues in order to strengthen the generated attacks. Experiments on 7 representative PLMs show that DQAAT can defend MRC models against various attacks while maintaining a high level of standard accuracy.