BadSampler: Harnessing the Power of Catastrophic Forgetting to Poison Byzantine-robust Federated Learning

Yi Liu, Cong Wang, Xingliang Yuan
DOI: https://doi.org/10.1145/3637528.3671879
2024-06-18
Abstract: Federated Learning (FL) is susceptible to poisoning attacks, wherein compromised clients manipulate the global model by modifying local datasets or sending manipulated model updates. Experienced defenders can readily detect and mitigate the poisoning effects of malicious behaviors using Byzantine-robust aggregation rules. However, the exploration of poisoning attacks in scenarios where such behaviors are absent remains largely unexplored for Byzantine-robust FL. This paper addresses the challenging problem of poisoning Byzantine-robust FL by introducing catastrophic forgetting. To fill this gap, we first formally define generalization error and establish its connection to catastrophic forgetting, paving the way for the development of a clean-label data poisoning attack named BadSampler. This attack leverages only clean-label data (i.e., without poisoned data) to poison Byzantine-robust FL and requires the adversary to selectively sample training data with high loss to feed model training and maximize the model's generalization error. We formulate the attack as an optimization problem and present two elegant adversarial sampling strategies, Top-$\kappa$ sampling and meta-sampling, to approximately solve it. Additionally, our formal error upper bound and time complexity analysis demonstrate that our design can preserve attack utility with high efficiency. Extensive evaluations on two real-world datasets illustrate the effectiveness and performance of our proposed attacks.
Cryptography and Security, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of mounting data poisoning attacks (DPA) against Byzantine-robust federated learning (FL) without relying on unrealistic assumptions. Byzantine-robust FL systems use robust aggregation rules to limit the influence of malicious clients, which makes traditional data poisoning attacks largely ineffective. The paper therefore proposes BadSampler, an attack that exploits catastrophic forgetting to poison Byzantine-robust FL systems.

### Main contributions of the paper

1. **Designed BadSampler**: A clean-label data poisoning attack on Byzantine-robust FL systems. Unlike existing attacks, BadSampler uses only clean-label data and induces catastrophic forgetting by changing the order in which local data is sampled, which reduces the model's generalization ability.
2. **Proposed two optimized adversarial sampling strategies**:
   - **Top-$\kappa$ sampling strategy**: Select the samples with the highest losses as a candidate pool and draw hard samples from that pool for training, so as to maximize the model's generalization error.
   - **Meta-sampling strategy**: Apply meta-learning, using the histograms of the training and validation error distributions to guide sampling and further strengthen the attack.
3. **Conducted extensive experimental evaluations**: Experiments on convex and non-convex models over two public datasets, with comparisons against existing defense methods, demonstrate the effectiveness of BadSampler. The results show that BadSampler can significantly reduce model accuracy even under advanced defenses such as FLTrust.
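The Top-$\kappa$ strategy described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `top_kappa_batch`, the `loss_fn` callback, and the choice to draw the batch uniformly at random from the pool are all hypothetical, and the summary does not say how the pool size relates to the batch size.

```python
import random
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def top_kappa_batch(
    samples: Sequence[T],
    loss_fn: Callable[[T], float],
    kappa: int,
    batch_size: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return indices of a batch drawn from the kappa highest-loss samples.

    Assumption: the batch is drawn uniformly without replacement from the pool.
    """
    if not 0 < batch_size <= kappa <= len(samples):
        raise ValueError("need 0 < batch_size <= kappa <= len(samples)")
    rng = rng or random.Random()
    # Rank sample indices by their current loss, hardest first.
    ranked = sorted(
        range(len(samples)),
        key=lambda i: loss_fn(samples[i]),
        reverse=True,
    )
    # The candidate pool holds only the kappa hardest samples.
    pool = ranked[:kappa]
    return rng.sample(pool, batch_size)


if __name__ == "__main__":
    # Toy data: each "sample" is simply its own loss value.
    toy_losses = [0.1, 2.0, 0.5, 3.0, 0.05]
    batch = top_kappa_batch(
        toy_losses, loss_fn=lambda loss: loss, kappa=3, batch_size=2,
        rng=random.Random(0),
    )
    print(batch)
```

In a real training loop the losses would be recomputed against the current model before each batch is drawn, since the set of hardest samples shifts as the model updates.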
### Core ideas of the paper

- **Utilize catastrophic forgetting**: Carefully designed adversarial sampling strategies make the model gradually forget previously learned knowledge during training, which reduces its generalization ability.
- **Maintain good training behavior**: While inducing catastrophic forgetting, the attack keeps the training error low to avoid detection by advanced defense mechanisms.
- **Adapt to the dynamic training process**: The set of participating clients in FL changes over time. BadSampler adapts to this and performs effective adversarial sampling through whichever compromised clients are available.

### Technical details

- **Design of sampling strategies**:
  - **Top-$\kappa$ sampling strategy**: Select the $\kappa$ samples with the highest losses as the candidate pool and draw training samples from it.
  - **Meta-sampling strategy**: Use the histogram distributions of the training and validation errors as meta-states and determine each sample's sampling weight with a Gaussian function.
- **Definition of the optimization problem**:
  - The objective is to maximize the generalization error while minimizing the training error, so that the attack is not filtered out by the defense mechanism.
  - The formal representation of the optimization problem is:

\[
\begin{aligned}
\max_{B'_1, B'_2, \ldots, B'_m} \quad & \text{Err}, \\
\text{subject to} \quad & B' = A(B'_1, B'_2, \ldots, B'_m), \\
& (x, y) \in D_m, \\
& \min_{(x, y) \in D_m} E^m_{D_t}.
\end{aligned}
\]

- **Theoretical analysis**:
  - The analysis shows that BadSampler's error upper bound depends on internal FL parameters and that the attack remains efficient within practical parameter ranges.

### Conclusion

This paper realizes data poisoning attacks on Byzantine-robust FL systems by exploiting catastrophic forgetting through a new adversarial sampling attack, BadSampler. Experimental results show that BadSampler can significantly reduce the model's generalization ability while maintaining good training behavior, and that it remains effective against existing defense methods.
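The meta-sampling description under Technical details (error histograms as meta-states, Gaussian sampling weights) can be illustrated with the sketch below. It is a simplified reading, not the paper's algorithm: the helper names, the bin count, the assumption that errors lie in `[0, max_error]`, and the idea that the Gaussian centre `mu` and width `sigma` are supplied by the caller are all hypothetical. The summary does not specify how the attack chooses those two values.

```python
import math
import random
from typing import List, Optional, Sequence


def error_histogram(
    errors: Sequence[float], bins: int = 5, max_error: float = 1.0
) -> List[float]:
    """Normalized histogram of per-sample errors over [0, max_error]."""
    if not errors:
        raise ValueError("errors must be non-empty")
    counts = [0] * bins
    for e in errors:
        # Clamp so that out-of-range errors fall into the edge bins.
        idx = min(max(int(e / max_error * bins), 0), bins - 1)
        counts[idx] += 1
    return [c / len(errors) for c in counts]


def meta_state(
    train_errors: Sequence[float], val_errors: Sequence[float], bins: int = 5
) -> List[float]:
    """Concatenate the training and validation error histograms."""
    return error_histogram(train_errors, bins) + error_histogram(val_errors, bins)


def gaussian_weights(
    errors: Sequence[float], mu: float, sigma: float
) -> List[float]:
    """Sampling probabilities that peak for samples whose error is near mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    raw = [math.exp(-((e - mu) ** 2) / (2 * sigma ** 2)) for e in errors]
    total = sum(raw)
    return [w / total for w in raw]


def sample_batch(
    errors: Sequence[float], mu: float, sigma: float, batch_size: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Draw sample indices (with replacement) according to the Gaussian weights."""
    rng = rng or random.Random()
    weights = gaussian_weights(errors, mu, sigma)
    return rng.choices(range(len(errors)), weights=weights, k=batch_size)


if __name__ == "__main__":
    train_err = [0.05, 0.10, 0.40, 0.85, 0.90]
    val_err = [0.20, 0.30, 0.70]
    print(meta_state(train_err, val_err))
    print(sample_batch(train_err, mu=0.9, sigma=0.2, batch_size=3,
                       rng=random.Random(0)))
```

Setting `mu` close to the largest observed error biases the batch toward hard samples, which matches the stated goal of feeding high-loss data to training. A smaller `sigma` concentrates the weight more sharply around `mu`.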