Abstract: As a distributed machine learning paradigm, Federated Learning (FL) enables large-scale clients to collaboratively train a model without sharing their raw data. However, due to the lack of data auditing for untrusted clients, FL is vulnerable to poisoning attacks, especially backdoor attacks. By using poisoned data for local training or directly changing the model parameters, attackers can easily inject backdoors into the model, which can trigger the model to make misclassification of targeted patterns in images. To address these issues, we propose a novel data-free trigger-generation-based defense approach based on the two characteristics of backdoor attacks: i) triggers are learned faster than normal knowledge, and ii) trigger patterns have a greater effect on image classification than normal class patterns. Our approach generates the images with newly learned knowledge by identifying the differences between the old and new global models, and filters trigger images by evaluating the effect of these generated images. By using these trigger images, our approach eliminates poisoned models to ensure the updated global model is benign. Comprehensive experiments demonstrate that our approach can defend against almost all the existing types of backdoor attacks and outperform all the seven state-of-the-art defense methods with both IID and non-IID scenarios. Especially, our approach can successfully defend against the backdoor attack even when 80% of the clients are malicious.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is the backdoor attack problem in Federated Learning (FL). Specifically, as a distributed machine learning paradigm, Federated Learning allows multiple clients to collaboratively train models without sharing the original data. However, due to the lack of data auditing for untrusted clients, Federated Learning is vulnerable to poisoning attacks, especially backdoor attacks.

### Characteristics of backdoor attacks:

1. **Stealthiness**: Backdoor attacks do not reduce the inference accuracy of the model on clean data, so they are highly stealthy.
2. **Targetedness**: By embedding a specific pattern (i.e., a trigger) in the input, an attacker can control the model to misclassify certain inputs into the target category.
3. **Widespread impact**: If a large number of malicious clients participate in model updates, these attacks may gradually be injected into the global model.

### Solutions proposed in the paper:

To solve the above problems, the authors propose a defense method based on data-free trigger generation. The core idea of this method is to utilize two characteristics:

- **Triggers are learned more quickly**: Compared with normal knowledge, triggers are more easily and quickly learned by the model.
- **Trigger patterns have a greater impact on classification**: Trigger patterns have a greater impact on image classification than normal category patterns.

### Method overview:

1. **Knowledge extraction**: Use Conditional Generative Adversarial Networks (CGAN) to generate images containing newly learned knowledge. These images can distinguish the differences between the old global model and the new aggregated global model.
2. **Trigger filtering**: Screen out those trigger images that can significantly change the classification results from the generated images to avoid interference from other normal knowledge.
3. **Model filtering**: Use the generated trigger images to detect and eliminate poisoned local models to ensure that the aggregated global model is benign.

### Main contributions:

- Propose a new defense method that does not require an additional data set or prior knowledge.
- Design knowledge extraction and trigger filtering mechanisms, which improve the generalization and reliability of the method in cases of extreme data distribution heterogeneity.
- Experiments show that this method can effectively defend against almost all existing types of backdoor attacks in different data distribution scenarios and is superior to seven of the latest defense methods.

Through this method, the paper effectively solves the problem of backdoor attacks in Federated Learning and ensures the accuracy and security of the model.
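The model filtering step from the method overview can be illustrated with a small sketch. This is not the paper's implementation: the summary does not specify the scoring rule, so the code assumes a simple one, namely that a local model is suspicious when it assigns most of the generated trigger images to their target label. The names `trigger_hit_rate`, `filter_local_models`, the `threshold` parameter, and the toy classifiers are all hypothetical and exist only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[float]              # toy stand-in for an image tensor
Classifier = Callable[[Image], int]  # maps an image to a predicted class label


def trigger_hit_rate(model: Classifier,
                     trigger_images: List[Tuple[Image, int]]) -> float:
    """Fraction of trigger images the model assigns to the paired target label."""
    if not trigger_images:
        return 0.0
    hits = sum(1 for image, target in trigger_images if model(image) == target)
    return hits / len(trigger_images)


def filter_local_models(local_models: List[Classifier],
                        trigger_images: List[Tuple[Image, int]],
                        threshold: float = 0.5) -> List[int]:
    """Return indices of local models whose hit rate is at or below the threshold.

    The threshold is an assumed hyperparameter, not a value from the paper.
    """
    return [i for i, model in enumerate(local_models)
            if trigger_hit_rate(model, trigger_images) <= threshold]


# Toy classifiers (hypothetical): the poisoned one maps a bright first pixel to class 7.
def benign(image: Image) -> int:
    return 0 if sum(image) < 2.0 else 1


def poisoned(image: Image) -> int:
    return 7 if image[0] > 0.9 else benign(image)


# Pretend these (image, target label) pairs came out of the trigger filtering step.
triggers = [([1.0, 0.1, 0.1], 7), ([0.95, 0.2, 0.0], 7)]
kept = filter_local_models([benign, poisoned, benign], triggers)
print(kept)  # indices of the models that would be kept for aggregation
```

In this toy run only the two benign classifiers survive, so the aggregation step would average their updates alone. A real system would score neural network outputs rather than hand-written rules, and would likely need a more careful criterion than a fixed threshold.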

Protect Federated Learning Against Backdoor Attacks via Data-Free Trigger Generation

Backdoor Attacks and Defenses in Federated Learning: State-of-the-Art, Taxonomy, and Future Directions

Concealing Backdoor Model Updates in Federated Learning by Trigger-Optimized Data Poisoning

On the Vulnerability of Backdoor Defenses for Federated Learning

Coordinated Backdoor Attacks against Federated Learning with Model-Dependent Triggers

FTA: Stealthy and Adaptive Backdoor Attack with Flexible Triggers on Federated Learning

Mitigating Backdoors in Federated Learning with FLD

How To Backdoor Federated Learning

Federated Learning Backdoor Attack Based on Frequency Domain Injection

Towards Practical Backdoor Attacks on Federated Learning Systems

Non-Cooperative Backdoor Attacks in Federated Learning: A New Threat Landscape

FLARE: A Backdoor Attack to Federated Learning with Refined Evasion

A Robust Defense Algorithm for Backdoor Erasure Based on Attention Alignment in Federated Learning

Practical and General Backdoor Attacks against Vertical Federated Learning

FLSAD: Defending Backdoor Attacks in Federated Learning via Self-Attention Distillation

FLMAAcBD: Defending against backdoors in Federated Learning via Model Anomalous Activation Behavior Detection

Mitigating Backdoor Attacks in Federated Learning via Flipping Weight Updates of Low-Activation Input Neurons

Backdoor Federated Learning by Poisoning Backdoor-Critical Layers

Act in Collusion: A Persistent Distributed Multi-Target Backdoor in Federated Learning

Dual Model Replacement: invisible Multi-target Backdoor Attack based on Federal Learning

Securing Federated Learning Against Novel and Classic Backdoor Threats During Foundation Model Integration