PFAttack: Stealthy Attack Bypassing Group Fairness in Federated Learning

Jiashi Gao, Ziwei Wang, Xiangyu Zhao, Xin Yao, Xuetao Wei
2024-10-09
Abstract: Federated learning (FL), integrating group fairness mechanisms, allows multiple clients to collaboratively train a global model that makes unbiased decisions for different populations grouped by sensitive attributes (e.g., gender and race). Previous studies have demonstrated that, owing to their distributed nature, FL systems are vulnerable to model poisoning attacks. However, these studies primarily focus on perturbing accuracy, leaving a critical question unexplored: Can an attacker bypass the group fairness mechanisms in FL and manipulate the global model to be biased? The motivations for such an attack vary: an attacker might seek higher accuracy, which fairness considerations typically limit, or aim to cause ethical disruption. To address this question, we design a novel form of attack in FL, termed Profit-driven Fairness Attack (PFATTACK), which aims not to degrade global model accuracy but to bypass fairness mechanisms. Our fundamental insight is that group fairness seeks to weaken the dependence of outputs on input attributes related to sensitive information. In the proposed PFATTACK, an attacker can recover this dependence through local fine-tuning across various sensitive groups, thereby creating a biased yet accuracy-preserving malicious model and injecting it into FL through model replacement. Compared to attacks targeting accuracy, PFATTACK is stealthier. The malicious model in PFATTACK exhibits only subtle parameter variations relative to the original global model, making it robust against detection and filtering by Byzantine-resilient aggregations. Extensive experiments on benchmark datasets are conducted for four fair FL frameworks and three Byzantine-resilient aggregations against model poisoning, demonstrating the effectiveness and stealth of PFATTACK in bypassing group fairness mechanisms in FL.
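To make "dependence of outputs on sensitive attributes" concrete, the sketch below computes a demographic parity gap, one common way to quantify group unfairness. The abstract does not say which fairness metric the paper uses, so the metric choice, the function names, and the toy data are all assumptions made for illustration.

```python
def positive_rate(predictions, groups, group_value):
    """Fraction of positive predictions among samples in one sensitive group."""
    selected = [p for p, g in zip(predictions, groups) if g == group_value]
    if not selected:
        raise ValueError(f"no samples for group {group_value!r}")
    return sum(selected) / len(selected)


def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = [positive_rate(predictions, groups, g) for g in sorted(set(groups))]
    return max(rates) - min(rates)


# Hypothetical binary predictions for two sensitive groups, labelled 0 and 1.
groups = [0, 0, 0, 0, 1, 1, 1, 1]
fair_preds = [1, 0, 1, 0, 1, 0, 1, 0]    # both groups: 50% positive
biased_preds = [1, 1, 1, 0, 1, 0, 0, 0]  # group 0: 75%, group 1: 25%

fair_gap = demographic_parity_gap(fair_preds, groups)
biased_gap = demographic_parity_gap(biased_preds, groups)
print(fair_gap, biased_gap)  # prints: 0.0 0.5
```

A fairness mechanism pushes this gap toward zero; an attack of the kind described in the abstract would show up as the gap growing again while accuracy stays roughly unchanged.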
Machine Learning
What problem does this paper attempt to address?
This paper explores the following key questions:

1. **Can the group fairness mechanism in federated learning (FL) be bypassed?**
   - The paper proposes a new form of attack, the **Profit-driven Fairness Attack (PFAttack)**. Its goal is not to reduce the accuracy of the global model but to bypass the fairness mechanism of the FL system, making the global model biased.
2. **How can such an attack be designed and implemented?**
   - The attack relies on two components:
     - **Inverse-Debiasing (ID) fine-tuning**: restore the bias weakened by the fairness mechanism by fine-tuning the global model on local data.
     - **Aggregation weight estimation**: estimate the latest aggregation weights so that the malicious model replaces the global model more accurately.
3. **How effective are existing defense mechanisms against this new type of attack?**
   - The authors evaluate the effectiveness and stealthiness of PFAttack under different fairness-enhanced FL frameworks (such as FairBatch, FairReg, FairFed, f-qFedAvg) and multiple Byzantine-fault-tolerant aggregation methods (such as trimmed mean, trimmed median, Krum).

### Background and Motivation

- **Federated learning (FL)** is a distributed machine-learning technique that allows multiple clients to collaboratively train a global model without sharing or centralizing data. To keep the model's decisions unbiased across groups defined by sensitive attributes (such as gender and race), researchers have introduced group fairness mechanisms.
- **Model-poisoning attacks** have been widely studied in FL systems, but these attacks mainly focus on reducing model accuracy and ignore the potential threat to fairness.
- **PFAttack** differs in that it does not try to reduce accuracy. Instead, it makes the model biased by restoring the model's dependence on sensitive attributes, thereby bypassing the fairness mechanism.

### Main Contributions

- **Reveals the vulnerability of fairness-enhanced FL systems**, especially when facing targeted fairness attacks.
- **Proposes PFAttack**, a novel attack that bypasses the fairness mechanism while maintaining or improving accuracy.
- **Verifies the effectiveness and stealthiness of PFAttack** through experiments on multiple fairness-enhanced FL frameworks and benchmark datasets, and shows that it withstands Byzantine-fault-tolerant aggregation methods.

### Summary

By introducing PFAttack, this paper reveals potential fairness loopholes in federated learning systems and points to a new direction for future security research.
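The toy arithmetic below illustrates why aggregation weight estimation matters for model replacement. It uses the generic model-replacement idea from the poisoning literature under a weighted-average (FedAvg-style) aggregator, and it is not necessarily the paper's exact procedure. The function names, the assumption that honest clients submit exactly the current global model, and all numbers are hypothetical.

```python
def weighted_average(models, weights):
    """FedAvg-style aggregation: per-coordinate sum of weights[i] * models[i]."""
    dim = len(models[0])
    return [sum(w * m[k] for m, w in zip(models, weights)) for k in range(dim)]


def replacement_update(global_model, target_model, estimated_weight):
    """Submission that moves the aggregate from global_model to target_model,
    assuming the other clients submit models close to global_model and the
    submitting client's aggregation weight equals estimated_weight."""
    return [g + (t - g) / estimated_weight
            for g, t in zip(global_model, target_model)]


# Hypothetical two-parameter models and aggregation weights (client 0 submits).
global_model = [1.0, 2.0]
target_model = [1.5, 1.0]
weights = [0.25, 0.25, 0.5]
honest = [global_model, global_model]  # simplifying assumption

# Correct weight estimate: the aggregate lands on the target.
exact = replacement_update(global_model, target_model, estimated_weight=0.25)
agg_exact = weighted_average([exact] + honest, weights)

# Overestimated weight: the aggregate only moves part of the way.
rough = replacement_update(global_model, target_model, estimated_weight=0.5)
agg_rough = weighted_average([rough] + honest, weights)

print(agg_exact, agg_rough)  # prints: [1.5, 1.0] [1.25, 1.5]
```

With the correct weight the aggregate equals the target model, while a weight estimate that is twice too large moves the aggregate only halfway. This is consistent with the summary's point that a more accurate weight estimate yields a more accurate replacement.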