Abstract: Machine learning models, deep neural networks in particular, have performed remarkably well on benchmark datasets across a wide variety of domains. However, the ease of finding adversarial counter-examples remains a persistent problem: training times are measured in hours or days, while the time needed to find a successful adversarial counter-example is measured in seconds. Much work has gone into generating and defending against these adversarial counter-examples, but the relative costs of attacks and defences are rarely discussed. Additionally, machine learning research is almost entirely guided by test/train metrics, but these would require billions of samples to meet industry standards. The present work addresses the problem of understanding and predicting how particular model hyper-parameters influence the performance of a model in the presence of an adversary. The proposed approach uses survival models, worst-case examples, and a cost-aware analysis to precisely and accurately reject a particular model change during routine model training procedures rather than relying on real-world deployment, expensive formal verification methods, or accurate simulations of very complicated systems (e.g., digitally recreating every part of a car or a plane). Through an evaluation of many pre-processing techniques, adversarial counter-examples, and neural network configurations, the conclusion is that deeper models do offer marginal gains in survival times compared to shallower counterparts. However, we show that those gains are driven more by the model inference time than by inherent robustness properties. Using the proposed methodology, we show that ResNet is hopelessly insecure against even the simplest of white-box attacks.

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

The main purpose of this paper is to evaluate whether survival analysis can predict the success of a specific set of model hyperparameters and to explore the relationship between computational cost and prediction accuracy in both benign and adversarial environments. Specifically, by using specially designed challenging samples and survival models, the paper provides a framework to predict the expected failure time across the entire adversarial space.

### Specific Objectives

1. **Application of Survival Analysis Models**:
   - Use survival analysis models to analyze the performance of machine learning models under adversarial perturbations and provide substantial empirical evidence that survival analysis is both effective and dataset-independent, thereby allowing for more precise and accurate prediction of expected failure rates.
2. **Measurement of Model Robustness**:
   - Explore the relationship between latency, accuracy, and model depth through extensive signal preprocessing techniques and measure model robustness using survival analysis models.
3. **Proposal of New Metrics**:
   - Propose a new metric, the Training Rate and Survival Heuristic (TRASH), to evaluate the robustness of models against adversarial attacks under time and computational constraints.
4. **Empirical Evidence**:
   - Provide substantial empirical evidence showing that larger neural networks, while having a slight advantage in survival time over smaller models, owe this advantage mainly to model inference time rather than inherent robustness characteristics.

Through these studies, the paper demonstrates that even the simplest white-box attacks render ResNet hopeless in terms of security. Additionally, the paper explores how survival analysis methods can quickly eliminate ineffective strategies and proposes a simple yet effective cost-benefit metric.
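
To make the survival-analysis framing concrete, the following is a minimal sketch of a Kaplan-Meier estimator applied to attack times. It is an illustration only and not the paper's implementation: the summary above does not say which survival models the authors fit, and the function name `kaplan_meier`, the toy attack times, and the censoring convention (an attack that hits its time budget without succeeding is recorded as unobserved) are all assumptions made for this example.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Estimate a survival curve from attack times (hypothetical helper).

    durations: time until the attack succeeded, or until it was stopped.
    observed:  True if the attack succeeded at that time, False if the
               sample was censored (budget exhausted without a failure).
    Returns a list of (time, survival_probability) at each failure time.
    """
    failures = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)      # every sample leaves the risk set once
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            # Product-limit update: fraction of at-risk samples surviving t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data (assumed): seconds of attack time per sample; two samples
# were never broken within the budget and are therefore censored.
attack_seconds = [1, 2, 2, 3, 5]
attack_succeeded = [True, True, False, True, False]

for t, s in kaplan_meier(attack_seconds, attack_succeeded):
    print(f"t={t}s  estimated P(model still correct) = {s:.2f}")
```

A curve of this kind is one way to compare two model configurations: if one configuration's survival probability drops more slowly under the same attack, it takes longer on average to break. Whether that extra time is worth the additional training and inference cost is the question the cost-aware analysis described above is meant to answer.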

A Training Rate and Survival Heuristic for Inference and Robustness Evaluation (TRASHFIRE)

Boosting Adversarial Training in Safety-Critical Systems Through Boundary Data Selection

A Cost-Aware Approach to Adversarial Robustness in Neural Networks

Selecting Models based on the Risk of Damage Caused by Adversarial Attacks

A Model for Estimating Resiliency of AI-Based Classifiers Defending Against Cyber Attacks

Certifiers Make Neural Networks Vulnerable to Availability Attacks

Towards Precise Observations of Neural Model Robustness in Classification

Defense-Resistant Backdoor Attacks Against Deep Neural Networks in Outsourced Cloud Environment

How to Train your Antivirus: RL-based Hardening through the Problem-Space

Towards A Critical Evaluation of Robustness for Deep Learning Backdoor Countermeasures

Impact of Architectural Modifications on Deep Learning Adversarial Robustness

HASI: Hardware-Accelerated Stochastic Inference, A Defense Against Adversarial Machine Learning Attacks

Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks

Efficient Model Stealing Defense with Noise Transition Matrix

Model-Reuse Attacks on Deep Learning Systems

Trainwreck: A damaging adversarial attack on image classifiers

Understanding and Enhancing Robustness of Concept-Based Models

Hijacking Attacks against Neural Networks by Analyzing Training Data

JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks

TamperNN: Efficient Tampering Detection of Deployed Neural Nets

Pruning in the Face of Adversaries