RegionSparse: Leveraging Sparse Coding and Object Localization to Counter Adversarial Attacks

Yunjian Zhang, Yanwei Liu, Liming Wang, Zhen Xu, Qiuqing Jin
DOI: https://doi.org/10.1109/ijcnn48605.2020.9207050
2020-01-01
Abstract: Although deep neural networks have demonstrated exceptional performance in many computer vision tasks, they can be easily fooled by carefully generated adversarial examples. In this paper, we analyze the particular characteristics of adversarial examples via a novel technique we call activation visualization. Observing that the dominant features of adversarial examples are distributed over a high-dimensional space, we propose a defense framework named RegionSparse that projects images into a low-dimensional space to remove the influence of adversarial perturbations on the performance of deep neural networks. In RegionSparse, after a robust global dictionary is trained, an object localization mechanism first locates the region whose pixels are highly relevant to classification; sparse coding is then performed on the located object region, and perturbation suppression is applied to the remaining region. Extensive experiments on the ImageNet dataset under gray-box, black-box, and transferred attacks show that RegionSparse can defeat up to 90% of attacks, including strong ones such as the Momentum Iterative Fast Gradient Sign Method and Carlini-Wagner's L2 attack.
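The three-stage pipeline described in the abstract (locate the object region, sparse-code it, suppress perturbations elsewhere) can be illustrated with a minimal sketch. This is not the authors' implementation, and every concrete choice in it is an assumption: a fixed orthonormal DCT basis stands in for the trained global dictionary, the bounding box `box` is supplied by the caller instead of coming from a localization mechanism, keeping the `k` largest coefficients stands in for the sparse coding step, and coarse quantization stands in for perturbation suppression. The function names `dct_dictionary`, `sparse_project`, and `region_sparse_sketch` are hypothetical.

```python
import numpy as np

def dct_dictionary(n=8):
    """Orthonormal 2-D DCT-II basis for n*n patches.
    Assumption: a fixed stand-in for the trained global dictionary."""
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return np.kron(C, C)  # each row is one atom, shape (n*n, n*n)

def sparse_project(patch, D, k):
    """Keep the k (>= 1) largest-magnitude coefficients of a patch in D.
    For an orthonormal D this is the best k-term approximation."""
    coeffs = D @ patch.ravel()
    drop = np.argsort(np.abs(coeffs))[:-k]  # indices of all but the k largest
    coeffs[drop] = 0.0
    return (D.T @ coeffs).reshape(patch.shape)

def region_sparse_sketch(img, box, k=8, n=8, levels=8):
    """img: 2-D float array in [0, 1].
    box: (top, left, bottom, right), assumed aligned to multiples of n.
    Assumption: the box comes from some external object localizer."""
    D = dct_dictionary(n)
    # Stand-in perturbation suppression for the background: coarse quantization.
    out = np.round(img * (levels - 1)) / (levels - 1)
    top, left, bottom, right = box
    # Sparse projection of each n*n patch inside the object region.
    for r in range(top, bottom, n):
        for c in range(left, right, n):
            out[r:r + n, c:c + n] = sparse_project(img[r:r + n, c:c + n], D, k)
    return np.clip(out, 0.0, 1.0)
```

A call such as `region_sparse_sketch(img, (8, 8, 24, 24), k=8)` would return an image of the same shape in which the boxed region is a low-dimensional reconstruction and the rest is quantized. Smaller `k` discards more high-frequency content, which is where the abstract suggests adversarial features concentrate, at the cost of image detail.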