Defense against Adversarial Attacks with an Induced Class

Zhi Xu, Jun Wang, Jian Pu
DOI: https://doi.org/10.1109/IJCNN52387.2021.9533755
2021-01-01
Abstract: Though deep neural networks have succeeded in various real applications, their prediction performance degrades significantly under adversarial attacks. In this work, we investigate how adversarial attacks alter the pattern of the prediction distribution and argue that this alteration is the primary reason for the performance drop. To address it, we propose a simple yet effective method that introduces an induced class to attract the adversarial attack and thus protect the prediction order of the original classes. Experiments on two real-world datasets demonstrate that the proposed method can maintain prediction performance on both natural and adversarial examples.
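The abstract does not say how the induced class is used at prediction time. The sketch below shows one plausible reading, offered only as an illustration: the classifier outputs one extra logit for the induced class, and the final prediction is taken over the original classes only, so a perturbation absorbed by the induced class leaves their ranking intact. The function name, the position of the induced class (last index), and the logit values are all hypothetical and are not taken from the paper.

```python
def predict_original_class(logits, num_original_classes):
    """Return the top-scoring original class, ignoring the induced class.

    Assumption (not from the paper): the induced class occupies the last
    index, so `logits` has num_original_classes + 1 entries.
    """
    if len(logits) != num_original_classes + 1:
        raise ValueError("expected one extra logit for the induced class")
    original = logits[:num_original_classes]
    return max(range(num_original_classes), key=lambda i: original[i])


# Hypothetical logits for a 3-class problem plus one induced class.
clean_logits = [2.0, 0.5, -1.0, -3.0]
# Illustrative attacked logits: the induced class score rises sharply,
# while the ordering among the original classes is unchanged.
attacked_logits = [1.8, 0.6, -0.9, 5.0]

print(predict_original_class(clean_logits, 3))
print(predict_original_class(attacked_logits, 3))
```

In this toy case a plain argmax over all four attacked logits would pick the induced class, whereas restricting the decision to the original classes returns the same label as for the clean input.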