Abstract: Deep regression models are used in a wide variety of safety-critical applications, but are vulnerable to backdoor attacks. Although many defenses have been proposed for classification models, they are ineffective as they do not consider the uniqueness of regression models. First, the outputs of regression models are continuous values instead of discretized labels. Thus, the potential infected target of a backdoored regression model has infinite possibilities, which makes it impossible to be determined by existing defenses. Second, the backdoor behavior of backdoored deep regression models is triggered by the activation values of all the neurons in the feature space, which makes it difficult to be detected and mitigated using existing defenses. To resolve these problems, we propose DRMGuard, the first defense to identify if a deep regression model in the image domain is backdoored or not. DRMGuard formulates the optimization problem for reverse engineering based on the unique output-space and feature-space characteristics of backdoored deep regression models. We conduct extensive evaluations on two regression tasks and four datasets. The results show that DRMGuard can consistently defend against various backdoor attacks. We also generalize four state-of-the-art defenses designed for classifiers to regression models, and compare DRMGuard with them. The results show that DRMGuard significantly outperforms all those defenses.

What problem does this paper attempt to address?

This paper addresses the vulnerability of deep regression models (DRMs) to backdoor attacks. Many defenses have been proposed for classification models, but they do not carry over to regression models, which differ in two ways:

1. **Continuous output space**: Unlike classification models, regression models output continuous values rather than discrete labels. The potential infected target of a backdoored regression model therefore has infinitely many possibilities, so existing defenses cannot determine it.
2. **Triggering behavior in the feature space**: Backdoor behavior in regression models is triggered by the activation values of all neurons in the feature space, which makes it difficult to detect and mitigate with existing methods.

To address these problems, the authors propose DRMGuard, the first framework for identifying whether a deep regression model in the image domain has been backdoored. DRMGuard decides this by reverse-engineering the potential trigger function, using an optimization problem built on the unique output-space and feature-space characteristics of backdoored regression models.

### Specific problem summary

- **Challenge of continuous output space**: Because the output of a regression model is continuous, a defense cannot enumerate and analyze every possible target, as it can with the finite label set of a classification model.
- **Triggering mechanism in the feature space**: The backdoor behavior of a regression model is determined jointly by the activation values of all neurons, rather than by a few specific neurons.

### Solution

- **DRMGuard framework**: Reverse-engineer the potential trigger function by solving an optimization problem that searches for possible backdoor trigger patterns.
- **Construction of optimization problem**: Combine the characteristics of the output space and the feature space into a single optimization problem for reverse-engineering the backdoor trigger function.

### Experimental verification

The authors conducted extensive experiments on two regression tasks (gaze estimation and head pose estimation) and four datasets. The results show that DRMGuard consistently defends against various backdoor attacks and significantly outperforms four state-of-the-art classifier defenses that the authors generalized to regression models.

### Conclusion

This research addresses the vulnerability of deep regression models to backdoor attacks and proposes DRMGuard, a novel defense framework that fills a gap in backdoor defense for regression models.
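
The idea of reverse-engineering a trigger without knowing the target value can be made concrete with a small toy. The sketch below is not DRMGuard's actual formulation, which the summary above does not spell out. It assumes one plausible target-agnostic objective: a candidate trigger is scored by the variance of the model's outputs once the trigger is applied to clean samples, since a backdoor that maps triggered inputs to a single target should collapse that variance. The models, the single-feature "patch" trigger, and all names are hypothetical, and the feature-space side of the problem is not modeled here.

```python
import random
import statistics


def clean_model(x):
    """Toy regressor (assumption): a fixed linear map from 4 features to a scalar."""
    weights = [0.8, -0.5, 0.3, 0.6]
    return sum(w * v for w, v in zip(weights, x))


def backdoored_model(x, target=2.5):
    """Same regressor with a hypothetical backdoor: feature 3 >= 0.95 forces `target`."""
    if x[3] >= 0.95:
        return target
    return clean_model(x)


def reverse_engineer_patch(model, samples, grid):
    """Brute-force search over single-feature patches.

    Each candidate overwrites one feature with one value on every sample.
    The score is the population variance of the resulting outputs, so no
    target value has to be guessed. Returns (variance, index, value).
    """
    best = None
    for index in range(len(samples[0])):
        for value in grid:
            outputs = []
            for x in samples:
                patched = list(x)
                patched[index] = value
                outputs.append(model(patched))
            variance = statistics.pvariance(outputs)
            if best is None or variance < best[0]:
                best = (variance, index, value)
    return best


# Clean samples stay below 0.9, so the hypothetical trigger never fires by chance.
rng = random.Random(0)
samples = [[rng.uniform(0.0, 0.9) for _ in range(4)] for _ in range(200)]
grid = [i / 10 for i in range(11)]  # candidate patch values 0.0, 0.1, ..., 1.0

clean_score = reverse_engineer_patch(clean_model, samples, grid)
backdoor_score = reverse_engineer_patch(backdoored_model, samples, grid)

print("clean model:     ", clean_score)
print("backdoored model:", backdoor_score)
```

In this toy, the backdoored model admits a patch (feature 3 set to 1.0) that drives the output variance to zero, while the best patch for the clean model leaves clearly non-zero variance. A detector could threshold on that gap. A real image-domain defense would replace the grid search with gradient-based optimization over a trigger function.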

Defending Deep Regression Models against Backdoor Attacks
