Abstract: AI-enabled collaborative robots (cobots) are designed to work in close collaboration with humans, and therefore require stringent safety standards and quick response times. Adversarial attacks pose a significant threat to the deep learning models in these systems, making it crucial to develop methods that improve the models' robustness. Adversarial training is one such approach: it augments the training data with adversarial examples. Unfortunately, this comes at the cost of increased computational overhead and extended training times. In this work, we balance the need for additional adversarial data against the goal of minimizing training cost by selecting the most ‘valuable’ data for adversarial training. In particular, we propose a robustness-oriented boundary data selection method, RAST-AT, which stands for robust and fast adversarial training. RAST-AT selects training data near the decision boundary by considering adversarial perturbations. Our method improves the speed of model training on CIFAR-10 by 68.67%. Compared to other data selection methods, it achieves 10% higher accuracy when 10% of the training data is selected, and 7% higher robustness when 4% of the training data is selected. It also improves the efficiency of adversarial training by at least 25% at the same performance. Finally, we evaluate our method on a cobot system, generating adversarial patches as attacks and adopting RAST-AT as the defense. We find that RAST-AT can defend against 60% of untargeted attacks and 20% of targeted attacks. Our work highlights the benefits of developing effective defenses against adversarial attacks to ensure the security and reliability of AI-powered safety-critical systems.
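The idea of selecting data near the boundary under adversarial perturbation can be illustrated with a minimal sketch. This is not the RAST-AT algorithm, whose details the abstract does not give. It assumes a linear classifier with labels in {-1, +1}, a one-step L-infinity (FGSM-style) perturbation, and a hypothetical selection rule that keeps the samples whose perturbed margin is closest to zero. All function names, the weights, and the values of `eps` and `fraction` are illustrative assumptions.

```python
import numpy as np

def fgsm_perturb(x, y, w, eps):
    """One-step L-infinity attack on a linear classifier (labels in {-1, +1})."""
    # For the logistic loss, the input gradient is proportional to -y * w,
    # so the sign of the gradient is -y * sign(w).
    return x + eps * (-y[:, None]) * np.sign(w)[None, :]

def select_boundary_subset(x, y, w, b, eps, fraction):
    """Return indices of the samples whose adversarial margin is closest to zero.

    Hypothetical selection rule, used here only for illustration.
    """
    x_adv = fgsm_perturb(x, y, w, eps)
    adv_margin = y * (x_adv @ w + b)          # signed margin after the attack
    k = max(1, int(round(fraction * len(x)))) # number of samples to keep
    return np.argsort(np.abs(adv_margin))[:k]

# Toy data: 200 points in 5 dimensions, labeled by the classifier itself.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 5))
w = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
b = 0.1
y = np.where(x @ w + b >= 0, 1.0, -1.0)

# Keep the 10% of samples nearest the boundary under perturbation; only
# this subset would then be used to generate adversarial training examples.
selected = select_boundary_subset(x, y, w, b, eps=0.1, fraction=0.1)
```

In this sketch the cost saving comes from running the (more expensive) adversarial example generation during training only on `selected` rather than on the full dataset.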


Boosting Adversarial Training in Safety-Critical Systems Through Boundary Data Selection
