Abstract: Deep learning (DL) models are inherently vulnerable to adversarial examples (maliciously crafted inputs that trigger target DL models to misbehave), which significantly hinders the application of DL in security-sensitive domains. Intensive research on adversarial learning has led to an arms race between adversaries and defenders. Such a plethora of emerging attacks and defenses raises many questions: Which attacks are more evasive, preprocessing-proof, or transferable? Which defenses are more effective, utility-preserving, or general? Are ensembles of multiple defenses more robust than individual defenses? Yet, due to the lack of platforms for comprehensive evaluation of adversarial attacks and defenses, these critical questions remain largely unanswered. In this paper, we present the design, implementation, and evaluation of DEEPSEC, a uniform platform that aims to bridge this gap. In its current implementation, DEEPSEC incorporates 16 state-of-the-art attacks with 10 attack utility metrics, and 13 state-of-the-art defenses with 5 defensive utility metrics. To the best of our knowledge, DEEPSEC is the first platform that enables researchers and practitioners to (i) measure the vulnerability of DL models, (ii) evaluate the effectiveness of various attacks/defenses, and (iii) conduct comparative studies on attacks/defenses in a comprehensive and informative manner.
Leveraging DEEPSEC, we systematically evaluate the existing adversarial attack and defense methods and draw a set of key findings that demonstrate DEEPSEC's rich functionality: (1) the trade-off between misclassification and imperceptibility is empirically confirmed; (2) most defenses that claim to be universally applicable can only defend against limited types of attacks under restricted settings; (3) adversarial examples with higher perturbation magnitude are not necessarily easier to detect; (4) an ensemble of multiple defenses cannot improve the overall defense capability, but can improve the lower bound of the defense effectiveness of the individual defenses. Extensive analysis of DEEPSEC demonstrates its capabilities and advantages as a benchmark platform that can benefit future adversarial learning research.

MetaA: Multi-Dimensional Evaluation of Testing Ability Via Adversarial Examples in Deep Learning

Attack As Defense: Characterizing Adversarial Examples Using Robustness

Yet Meta Learning Can Adapt Fast, It Can Also Break Easily

Enhancing adversarial robustness for deep metric learning via neural discrete adversarial training

Towards characterizing adversarial defects of deep learning software from the lens of uncertainty

MEAD: A Multi-Armed Approach for Evaluation of Adversarial Examples Detectors

MAD: Meta Adversarial Defense Benchmark

META: Multidimensional Evaluation of Testing Ability

Generating Adversarial Examples for Holding Robustness of Source Code Processing Models

Meta Gradient Adversarial Attack

Deep Adversarial Metric Learning

Boosting Black-Box Adversarial Attacks with Meta Learning

Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models

METAL: Metamorphic Testing Framework for Analyzing Large-Language Model Qualities

Generating Adversarial Examples with an Optimized Quality

DeepFeature: Guiding adversarial testing for deep neural network systems using robust features

DEEPSEC: A Uniform Platform for Security Analysis of Deep Learning Model

Testing Deep Learning Models: A First Comparative Study of Multiple Testing Techniques

Hierarchical Distribution-Aware Testing of Deep Learning

Adversarial Examples: Opportunities and Challenges