Abstract: The widespread adoption of deep learning across industries has introduced substantial challenges, particularly in model explainability and security. The inherent complexity of deep learning models, while contributing to their effectiveness, also renders them susceptible to adversarial attacks. Among these, backdoor attacks are especially concerning: an attacker surreptitiously embeds specific triggers in the training data, causing the model to exhibit aberrant behavior when presented with inputs that contain those triggers. Such attacks often exploit vulnerabilities in outsourced processes, compromising model integrity without affecting performance on clean (trigger-free) input data. In this paper, we present a comprehensive review of existing mitigation strategies designed to counter backdoor attacks in image recognition. We provide an in-depth analysis of the theoretical foundations, practical efficacy, and limitations of these approaches. In addition, we benchmark sixteen state-of-the-art approaches against eight distinct backdoor attacks, using three datasets, four model architectures, and three poisoning ratios. Our results, derived from 122,236 individual experiments, indicate that while many approaches provide some level of protection, their performance can vary considerably. Furthermore, when compared to two seminal approaches, most newer approaches do not demonstrate substantial improvements in overall performance or consistency across diverse settings. Drawing on these findings, we propose potential directions for developing more effective and generalizable defensive mechanisms.
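To make the threat model in the abstract concrete, the following is a minimal sketch of patch-based data poisoning, assuming images stored as a NumPy array of shape (N, H, W, C) with values in [0, 1]. The function name `poison_dataset`, the white-square trigger, and all parameter values are illustrative assumptions; they are not taken from the paper and do not correspond to any of the eight attacks it benchmarks.

```python
import numpy as np


def poison_dataset(images, labels, target_label, poison_ratio, patch_size=3, seed=0):
    """Return poisoned copies of (images, labels) plus the poisoned indices.

    Hypothetical illustration: a fraction `poison_ratio` of the samples gets a
    white square stamped in the bottom-right corner and is relabeled to
    `target_label`. The remaining samples stay clean (trigger-free).
    """
    rng = np.random.default_rng(seed)
    images = images.copy()
    labels = labels.copy()

    # Number of samples to poison, derived from the poisoning ratio.
    n_poison = int(round(poison_ratio * len(images)))
    idx = rng.choice(len(images), size=n_poison, replace=False)

    # Stamp the trigger patch and flip the label for the selected samples.
    images[idx, -patch_size:, -patch_size:, :] = 1.0
    labels[idx] = target_label
    return images, labels, idx


# Toy data standing in for a real image dataset (values kept below 0.5 so the
# white trigger patch is easy to tell apart from clean pixels).
data_rng = np.random.default_rng(1)
clean_x = data_rng.random((100, 8, 8, 3)) * 0.5
clean_y = data_rng.integers(0, 10, size=100)

poisoned_x, poisoned_y, poisoned_idx = poison_dataset(
    clean_x, clean_y, target_label=7, poison_ratio=0.05
)
print(f"poisoned {len(poisoned_idx)} of {len(clean_x)} samples")
```

A model trained on `poisoned_x` and `poisoned_y` may learn to associate the patch with the target label while behaving normally on clean inputs, which is the behavior the surveyed mitigation strategies aim to detect or remove.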

Deep Image Destruction: Vulnerability of Deep Image-to-Image Models against Adversarial Attacks

Understanding Object Detection Through An Adversarial Lens

Defense against adversarial attacks based on color space transformation

Evaluating Similitude and Robustness of Deep Image Denoising Models via Adversarial Attack

Vulnerability Analysis of Chest X-Ray Image Classification Against Adversarial Attacks

When Deep Fool Meets Deep Prior: Adversarial Attack on Super-Resolution Network

Benchmarking Adversarial Robustness on Image Classification

Detecting Adversarial Examples Through Image Transformation

Undermining Image and Text Classification Algorithms Using Adversarial Attacks

Unrevealed Threats: A Comprehensive Study of the Adversarial Robustness of Underwater Image Enhancement Models

Review on Image Processing Based Adversarial Example Defenses in Computer Vision

Transferable Learned Image Compression-Resistant Adversarial Perturbations

Mitigating Adversarial Attacks for Deep Neural Networks by Input Deformation and Augmentation

A Review of Adversarial Attacks in Computer Vision

Detecting Adversarial Samples for Deep Learning Models: A Comparative Study

A Survey and Evaluation of Adversarial Attacks for Object Detection

Image Super-Resolution as a Defense Against Adversarial Attacks

Adversarial Attack and Defense in Deep Ranking

Countering Backdoor Attacks in Image Recognition: A Survey and Evaluation of Mitigation Strategies

How Resilient Are Deep Learning Models in Medical Image Analysis? The Case of the Moment-Based Adversarial Attack (Mb-AdA)