AdvJND: Generating Adversarial Examples with Just Noticeable Difference
Zifei Zhang, Kai Qiao, Lingyun Jiang, Linyuan Wang, Jian Chen, Bin Yan
DOI: https://doi.org/10.1007/978-3-030-62460-6_42
2020-01-01
Abstract: Compared with traditional machine learning models, deep neural networks perform better, especially in image classification tasks. However, they are vulnerable to adversarial examples: adding small perturbations to an input causes a well-performing model to misclassify the crafted example, even though human eyes perceive no change in category. There are two requirements for generating adversarial examples: a high attack success rate and high image fidelity. Generally, the perturbation magnitude is increased to ensure a high attack success rate; however, the resulting adversarial examples have poor concealment. To alleviate the tradeoff between attack success rate and image fidelity, we propose a method named AdvJND, which adds visual model coefficients, namely the just noticeable difference (JND), to the constraint of the distortion function when generating adversarial examples. In effect, the subjective visual perception of the human eye is added as prior information that determines the distribution of perturbations, improving the image quality of adversarial examples. We tested our method on the FashionMNIST, CIFAR10, and MiniImageNet datasets. Our adversarial examples maintain high image quality at the cost of a slight decrease in attack success rate. Because the AdvJND algorithm yields gradient distributions similar to those of the original inputs, the crafted noise can be hidden in the original inputs, significantly improving attack concealment.
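The abstract does not state the JND formula or the attack it is combined with, so the following is only a minimal sketch of the general idea: a per-pixel visual coefficient map scales an FGSM-style sign step so that perturbations concentrate where the image already varies. The names `jnd_proxy` and `jnd_weighted_perturbation`, the parameter `epsilon`, and the use of normalized gradient magnitude as a stand-in for JND are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def jnd_proxy(image, eps=1e-8):
    """Hypothetical stand-in for a JND map (assumption, not the paper's formula).

    Uses the image's gradient magnitude, normalized to [0, 1], so that
    textured or edge regions get large coefficients and flat regions get ~0.
    """
    gy, gx = np.gradient(image.astype(np.float64))  # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    return magnitude / (magnitude.max() + eps)


def jnd_weighted_perturbation(image, loss_grad, epsilon=0.1):
    """FGSM-style sign step, scaled per pixel by the JND proxy.

    image     : 2-D grayscale array with values in [0, 1]
    loss_grad : gradient of the classifier loss w.r.t. the image (same shape);
                assumed to be supplied by whatever model is being attacked
    epsilon   : maximum per-pixel perturbation (hypothetical default)
    """
    weights = jnd_proxy(image)
    delta = epsilon * weights * np.sign(loss_grad)
    return np.clip(image + delta, 0.0, 1.0)


if __name__ == "__main__":
    # Toy image: dark left half, brighter right half, one vertical edge.
    img = np.zeros((8, 8))
    img[:, 4:] = 0.5
    grad = np.ones_like(img)  # placeholder for a real loss gradient
    adv = jnd_weighted_perturbation(img, grad, epsilon=0.1)
    print(np.abs(adv - img).round(3))
```

In this toy case the perturbation is nonzero only in the two columns adjacent to the edge and stays within `epsilon` everywhere, which illustrates the concealment idea of hiding noise where the input itself changes; a real JND model would also account for factors such as luminance adaptation.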