Abstract: Adversarial attacks on deep-learning models pose a serious threat to their reliability and security. Existing defense mechanisms are narrow: they either address a specific type of attack or remain vulnerable to sophisticated attacks. We propose a new defense mechanism that, while focused on image-based classifiers, is general within that category. It is rooted in hyperspace projection. In particular, our solution applies a pseudo-random projection to the original dataset to obtain a new dataset. The proposed defense mechanism creates a set of diverse projected datasets, each of which is used to train a specific classifier, resulting in different trained classifiers with different decision boundaries. During testing, it randomly selects one of these classifiers to classify the input. Our approach does not sacrifice accuracy on legitimate inputs. In addition to detailing and thoroughly characterizing our defense mechanism, we provide a proof of concept using four optimization-based adversarial attacks (PGD, FGSM, IGSM, and C&W) and a generative adversarial attack, all tested on the MNIST dataset. Our experimental results show that our solution increases the robustness of deep learning models against adversarial attacks and reduces the attack success rate by at least 89% for optimization attacks and 78% for generative attacks. We also analyze the relationship between the number of hyperspaces used and the efficacy of the defense mechanism. As expected, the two are positively correlated, offering an easy-to-tune parameter to enforce the desired level of security. The generality and scalability of our solution and its adaptability to different attack scenarios, combined with the results achieved, not only provide a robust defense against adversarial attacks on deep learning networks but also lay the groundwork for future research in the field.
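The pipeline described in the abstract (project the dataset pseudo-randomly, train one classifier per projected dataset, pick a classifier at random at test time) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the projection is a seeded Gaussian linear map, the per-hyperspace model is a nearest-centroid classifier standing in for a deep network, and the names `make_projection`, `CentroidClassifier`, and `ProjectionEnsemble` are hypothetical.

```python
import numpy as np


def make_projection(dim, seed):
    """Seeded Gaussian linear map (an assumed stand-in for the paper's projection)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim, dim)) / np.sqrt(dim)


class CentroidClassifier:
    """Nearest-centroid classifier, a placeholder for a deep classifier."""

    def fit(self, x, y):
        self.labels = np.unique(y)
        self.centroids = np.stack([x[y == c].mean(axis=0) for c in self.labels])
        return self

    def predict(self, x):
        # Squared Euclidean distance from every sample to every class centroid.
        dist = ((x[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        return self.labels[dist.argmin(axis=1)]


class ProjectionEnsemble:
    """One classifier per projected dataset; a random one answers each query."""

    def __init__(self, dim, n_spaces, seed=0):
        # Each hyperspace gets its own seed, hence its own projection.
        self.projections = [make_projection(dim, seed + i) for i in range(n_spaces)]
        self.models = []
        self.rng = np.random.default_rng(seed + n_spaces)

    def fit(self, x, y):
        # Train each model on its own projected copy of the training set.
        self.models = [CentroidClassifier().fit(x @ p, y) for p in self.projections]
        return self

    def predict(self, x):
        # Randomly select which hyperspace (and classifier) handles this call.
        k = self.rng.integers(len(self.models))
        return self.models[k].predict(x @ self.projections[k])
```

In this sketch `n_spaces` plays the role of the number of hyperspaces that the abstract describes as the tunable security parameter; an attacker who crafts a perturbation against one projected model cannot know which model will answer a given query.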

Adversarial Defense Via Self-Orthogonal Randomization Super-Network.

Attack As Defense: Characterizing Adversarial Examples Using Robustness.

Dynamic Defense Approach for Adversarial Robustness in Deep Neural Networks via Stochastic Ensemble Smoothed Model

Ensemble Methods as a Defense to Adversarial Perturbations Against Deep Neural Networks

Adversarial Example Defenses: Ensembles of Weak Defenses are not Strong

A Universal Defense Strategy Against Adversarial Attacks Based on Attention-Guided

Improving Adversarial Robustness via Promoting Ensemble Diversity.

Adversarial Attacks Neutralization via Data Set Randomization

Improving Model Robustness Against Adversarial Examples with Redundant Fully Connected Layer.

Adversarial Robust Decision-Making under Uncertainty Learning and Dynamic Ensemble Selection

Towards robust neural networks via orthogonal diversity

An Empirical Investigation of Randomized Defenses against Adversarial Attacks

Self-ensemble Adversarial Training for Improved Robustness

Ensemble-in-One: Ensemble Learning Within Random Gated Networks for Enhanced Adversarial Robustness.

Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks

Orthogonal Deep Models As Defense Against Black-Box Attacks

Dynamic ensemble selection based on Deep Neural Network Uncertainty Estimation for Adversarial Robustness

EnResNet: ResNets Ensemble Via the Feynman-Kac Formalism for Adversarial Defense and Beyond

Synergy-of-Experts: Collaborate to Improve Adversarial Robustness