Understanding Deep Learning defenses Against Adversarial Examples Through Visualizations for Dynamic Risk Assessment
Xabier Echeberria-Barrio, Amaia Gil-Lerchundi, Jon Egana-Zubia, Raul Orduna-Urrutia
DOI: https://doi.org/10.1007/s00521-021-06812-y
2024-02-13
Abstract:In recent years, Deep Neural Network models have been developed in different
fields, where they have brought many advances. However, they have also started
to be used in tasks where risk is critical. A misdiagnosis of these models can
lead to serious accidents or even death. This concern has led to an interest
among researchers to study possible attacks on these models, discovering a long
list of vulnerabilities, from which every model should be defended. The
adversarial example attack is a widely known attack among researchers, who have
developed several defenses to avoid such a threat. However, these defenses are
as opaque as a deep neural network model, how they work is still unknown. This
is why visualizing how they change the behavior of the target model is
interesting in order to understand more precisely how the performance of the
defended model is being modified. For this work, some defenses, against
adversarial example attack, have been selected in order to visualize the
behavior modification of each of them in the defended model. Adversarial
training, dimensionality reduction and prediction similarity were the selected
defenses, which have been developed using a model composed by convolution
neural network layers and dense neural network layers. In each defense, the
behavior of the original model has been compared with the behavior of the
defended model, representing the target model by a graph in a visualization.
Machine Learning,Cryptography and Security