Visual Analytics of Neuron Vulnerability to Adversarial Attacks on Convolutional Neural Networks

Yiran Li, Junpeng Wang, Takanori Fujiwara, Kwan-Liu Ma
DOI: https://doi.org/10.1145/3587470
2023-03-06
Abstract: Adversarial attacks on a convolutional neural network (CNN) -- injecting human-imperceptible perturbations into an input image -- could fool a high-performance CNN into making incorrect predictions. The success of adversarial attacks raises serious concerns about the robustness of CNNs, and prevents them from being used in safety-critical applications, such as medical diagnosis and autonomous driving. Our work introduces a visual analytics approach to understanding adversarial attacks by answering two questions: (1) which neurons are more vulnerable to attacks and (2) which image features do these vulnerable neurons capture during the prediction? For the first question, we introduce multiple perturbation-based measures to break down the attacking magnitude into individual CNN neurons and rank the neurons by their vulnerability levels. For the second, we identify image features (e.g., cat ears) that highly stimulate a user-selected neuron to augment and validate the neuron's responsibility. Furthermore, we support an interactive exploration of a large number of neurons by aiding with hierarchical clustering based on the neurons' roles in the prediction. To this end, a visual analytics system is designed to incorporate visual reasoning for interpreting adversarial attacks. We validate the effectiveness of our system through multiple case studies as well as feedback from domain experts.
Computer Vision and Pattern Recognition, Cryptography and Security, Human-Computer Interaction, Machine Learning
What problem does this paper attempt to address?
The paper addresses the vulnerability of convolutional neural networks (CNNs) to adversarial attacks. Specifically, it focuses on understanding and interpreting how such attacks affect a CNN: which neurons are more vulnerable to attacks, and which image features those vulnerable neurons capture during prediction. By answering these questions, the paper aims to improve the reliability and robustness of CNNs in safety-critical applications such as medical diagnosis and autonomous driving.

To this end, the paper proposes a visual analytics approach that can:

1. **Identify more vulnerable neurons**: Introduce multiple perturbation-based measures that decompose the attack magnitude onto individual CNN neurons and rank the neurons by their vulnerability levels.
2. **Understand the image features captured by vulnerable neurons**: Use receptive fields (RFs) to reveal the regions of the input image that strongly stimulate each neuron, and visually compare these RFs before and after a set of image perturbations.

To make exploring a large number of neurons and interpreting their semantics efficient, the paper also supports hierarchical clustering based on the similarity of the neurons' roles in the prediction. Together, these methods form an integrated visual analytics system that supports flexible exploration across multiple input images and neurons, helping domain experts obtain actionable insights from the interpretation of adversarial attacks.
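The idea of ranking neurons by a perturbation-based measure can be illustrated with a minimal sketch. This summary does not specify the paper's actual measures, so the score used here (mean absolute change in a neuron's activation between clean and perturbed inputs) is only an assumed stand-in, and the function names `vulnerability_scores` and `rank_neurons` and the toy activation values are hypothetical.

```python
from statistics import mean


def vulnerability_scores(clean_acts, perturbed_acts):
    """Return one score per neuron: the mean absolute activation change.

    ASSUMPTION: this score is an illustrative stand-in, not the paper's
    measure. Each argument is a list of per-image activation lists, with
    one activation value per neuron.
    """
    n_neurons = len(clean_acts[0])
    scores = []
    for j in range(n_neurons):
        # Activation change of neuron j on each (clean, perturbed) image pair.
        diffs = [abs(p[j] - c[j]) for c, p in zip(clean_acts, perturbed_acts)]
        scores.append(mean(diffs))
    return scores


def rank_neurons(scores):
    """Return neuron indices ordered from most to least vulnerable."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Toy data (made up): 2 images, 3 neurons.
clean = [[0.2, 1.0, 0.5], [0.1, 0.8, 0.4]]
perturbed = [[0.25, 0.1, 0.5], [0.1, 0.2, 0.9]]

scores = vulnerability_scores(clean, perturbed)
ranking = rank_neurons(scores)
print(ranking)
```

In this toy data, neuron 1 changes the most under perturbation and neuron 0 the least, so the ranking places neuron 1 first. In practice the activations would come from a trained CNN evaluated on clean and adversarially perturbed images.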