Abstract: AI-enabled collaborative robots are designed to work in close collaboration with humans, and therefore require stringent safety standards and quick response times. Adversarial attacks pose a significant threat to the deep learning models of these systems, making it crucial to develop methods that improve the models' robustness against them. Adversarial training is one such approach: it augments the training data with adversarial examples. Unfortunately, this comes at the cost of increased computational overhead and extended training times. In this work, we balance the need for additional adversarial data against the goal of minimizing training costs by selecting the most ‘valuable’ data for adversarial training. In particular, we propose a robustness-oriented boundary data selection method, RAST-AT, which stands for robust and fast adversarial training. RAST-AT selects training data near the decision boundary by considering adversarial perturbations. Our method improves the speed of model training on CIFAR-10 by 68.67% and, compared to other data selection methods, achieves 10% higher accuracy when 10% of the training data is selected and 7% higher robustness when 4% of the training data is selected. It also improves the efficiency of adversarial training by at least 25% at the same performance. Finally, we evaluate our method on a cobot system, generating adversarial patches as attacks and adopting RAST-AT as the defense. We find that RAST-AT can defend against 60% of untargeted attacks and 20% of targeted attacks. Our work highlights the benefits of developing effective defenses against adversarial attacks to ensure the security and reliability of AI-powered safety-critical systems.
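
The general idea of selecting near-boundary data and augmenting the training set with adversarial examples can be illustrated with a toy sketch. This is not the RAST-AT algorithm, which the abstract does not specify. It assumes a fixed binary linear classifier, uses FGSM as a stand-in attack, and ranks samples by the smallest L-infinity perturbation that reaches the decision boundary. The function names, the weights `w` and `b`, the 10% fraction, and `eps` are all hypothetical choices for illustration.

```python
import numpy as np

def fgsm_linear(x, y, w, b, eps):
    """FGSM for a binary logistic model p = sigmoid(w.x + b), labels y in {0, 1}.
    Assumption: a linear model stands in for the deep network."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    grad_x = (p - y)[:, None] * w[None, :]  # gradient of cross-entropy w.r.t. x
    return x + eps * np.sign(grad_x)

def select_boundary_samples(x, w, b, fraction):
    """Indices of the `fraction` of samples closest to the decision boundary.
    Distance is the smallest L-inf perturbation reaching w.x + b = 0,
    which for a linear model is |w.x + b| / ||w||_1."""
    linf_dist = np.abs(x @ w + b) / np.sum(np.abs(w))
    k = max(1, int(round(fraction * len(x))))
    return np.argsort(linf_dist)[:k]

# Toy data: labels come from the (hypothetical) model itself.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
w = np.array([1.0, -2.0])
b = 0.5
y = (x @ w + b > 0).astype(float)

# Perturb only the selected near-boundary subset, then augment.
idx = select_boundary_samples(x, w, b, fraction=0.1)
x_adv = fgsm_linear(x[idx], y[idx], w, b, eps=0.1)
x_aug = np.concatenate([x, x_adv])
y_aug = np.concatenate([y, y[idx]])
```

In this sketch only 20 of the 200 samples are perturbed, so the cost of generating adversarial examples scales with the selected fraction rather than with the full dataset. That is the trade-off the abstract describes between adversarial coverage and training cost.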

Is your benchmark truly adversarial? AdvScore: Evaluating Human-Grounded Adversarialness

Boosting Adversarial Training in Safety-Critical Systems Through Boundary Data Selection

Measuring Adversarial Datasets

Robust Assessment of Real-World Adversarial Examples

Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models

ROBY: Evaluating the adversarial robustness of a deep model by its decision boundaries

How the Advent of Ubiquitous Large Language Models both Stymie and Turbocharge Dynamic Adversarial Question Generation

Beyond Score Changes: Adversarial Attack on No-Reference Image Quality Assessment from Two Perspectives

An Empirical Study of Accuracy, Fairness, Explainability, Distributional Robustness, and Adversarial Robustness

Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models

The Adversarial AI-Art: Understanding, Generation, Detection, and Benchmarking

How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples

Suspiciousness of Adversarial Texts to Human

Enhancing Adversarial Robustness via Score-Based Optimization

Benchmarking Adversarial Robustness on Image Classification

A practical approach to evaluating the adversarial distance for machine learning classifiers

Asymmetric Bias in Text-to-Image Generation with Adversarial Attacks

Robust Testing of AI Language Model Resiliency with Novel Adversarial Prompts

Assessing Robustness via Score-Based Adversarial Image Generation

Adversarial Robustness Through Artifact Design

Adversarial Attack on Attackers: Post-Process to Mitigate Black-Box Score-Based Query Attacks