Abstract: Federated learning (FL) that integrates group fairness mechanisms allows multiple clients to collaboratively train a global model that makes unbiased decisions for different populations grouped by sensitive attributes (e.g., gender and race). Previous studies have demonstrated that, owing to their distributed nature, FL systems are vulnerable to model poisoning attacks. However, these studies primarily focus on perturbing accuracy, leaving a critical question unexplored: Can an attacker bypass the group fairness mechanisms in FL and manipulate the global model to be biased? The motivations for such an attack vary: an attacker might seek higher accuracy, which fairness considerations typically limit in the global model, or might aim to cause ethical disruption. To address this question, we design a novel form of attack in FL, termed Profit-driven Fairness Attack (PFATTACK), which aims not to degrade global model accuracy but to bypass fairness mechanisms. Our fundamental insight is that group fairness seeks to weaken the dependence of outputs on input attributes related to sensitive information. In the proposed PFATTACK, an attacker can recover this dependence through local fine-tuning across various sensitive groups, thereby creating a biased yet accuracy-preserving malicious model and injecting it into FL through model replacement. Compared to attacks targeting accuracy, PFATTACK is more stealthy: the malicious model exhibits only subtle parameter variations relative to the original global model, making it robust against detection and filtering by Byzantine-resilient aggregations. Extensive experiments on benchmark datasets, covering four fair FL frameworks and three Byzantine-resilient aggregations designed against model poisoning, demonstrate the effectiveness and stealth of PFATTACK in bypassing group fairness mechanisms in FL.
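The abstract does not name a specific fairness metric, so the following is only an illustrative sketch of what "dependence of outputs on sensitive attributes" can mean in practice. It assumes demographic parity, one common group fairness notion, and measures the gap in positive-prediction rates between groups. The function name and the toy data are hypothetical.

```python
def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate across sensitive groups.

    predictions: list of 0/1 model outputs.
    groups: list of sensitive-group labels, aligned with predictions.
    Assumption: demographic parity is used as the group fairness notion.
    """
    rates = []
    for g in sorted(set(groups)):
        # Collect the predictions made for members of group g.
        members = [p for p, a in zip(predictions, groups) if a == g]
        rates.append(sum(members) / len(members))
    return max(rates) - min(rates)


# Toy data (hypothetical): group "a" receives positive predictions more often.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(preds, groups)
print(gap)
```

A fairness mechanism would try to push this gap toward zero; an attack of the kind described above would aim to let it grow again while keeping accuracy intact.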

### What problems does this paper attempt to solve?

This paper aims to explore and solve the following key problems:

1. **Can the group fairness mechanism in Federated Learning (FL) be bypassed?**
   - The paper proposes a new form of attack, the **Profit-driven Fairness Attack (PFAttack)**. The goal of this attack is not to reduce the accuracy of the global model, but to bypass the fairness mechanism in the federated learning system, making the global model biased.
2. **How can this attack be designed and implemented?**
   - The authors propose the following methods to achieve this goal:
     - **Inverse-Debiasing (ID) Fine-tuning**: Restore the bias weakened by the fairness mechanism by fine-tuning the global model on local data.
     - **Aggregation Weight Estimation**: Estimate the latest aggregation weights so that the malicious model can replace the global model more accurately.
3. **How effective are the existing defense mechanisms against this new type of attack?**
   - The authors evaluate the effectiveness and stealthiness of PFAttack under different fairness-enhanced federated learning frameworks (FairBatch, FairReg, FairFed, f-qFedAvg) and multiple Byzantine-fault-tolerant aggregation methods (trimmed mean, trimmed median, Krum).

### Background and Motivation

- **Federated Learning (FL)** is a distributed machine-learning technique that allows multiple clients to collaboratively train a global model without sharing or centralizing data. To ensure that the model's decisions are unbiased towards different groups (such as those defined by gender or race), researchers have introduced group fairness mechanisms.
- **Model-poisoning attacks** have been widely studied in FL systems, but these attacks mainly focus on reducing model accuracy and overlook the potential threat to fairness.
- The uniqueness of **PFAttack** is that it does not aim to reduce accuracy. Instead, it makes the model biased by restoring the model's dependence on sensitive attributes, thereby bypassing the fairness mechanism.

### Main Contributions

- **Reveal the vulnerability of fairness-enhanced federated learning systems**, especially when facing targeted fairness attacks.
- **Propose PFAttack**, a novel attack method that can bypass the fairness mechanism while maintaining or improving accuracy.
- **Verify the effectiveness and stealthiness of PFAttack** through experiments on multiple fairness-enhanced FL frameworks and benchmark datasets, and show its robustness against Byzantine-fault-tolerant aggregation methods.

### Summary

By introducing PFAttack, this paper reveals potential loopholes in the fairness of federated learning systems and provides a new direction for future security research.
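To illustrate why an estimate of the aggregation weight matters for model replacement, here is a minimal sketch of the generic model-replacement scaling used in the model-poisoning literature. It is not the paper's exact procedure, which this summary does not spell out. It assumes the server computes a weighted average of client models with weights summing to 1, and that benign clients submit models close to the current global model. All names and numbers are hypothetical.

```python
def replacement_update(global_model, malicious_model, est_weight):
    """Scale the malicious model so that weighted averaging lands on it.

    est_weight is the attacker's estimate of its own aggregation weight
    (assumption: the aggregate is sum_k p_k * w_k with sum_k p_k == 1).
    """
    return [g + (m - g) / est_weight
            for g, m in zip(global_model, malicious_model)]


def aggregate(models, weights):
    """Weighted average of client models, coordinate by coordinate."""
    dim = len(models[0])
    return [sum(p * w[i] for p, w in zip(weights, models)) for i in range(dim)]


global_model = [1.0, -2.0, 0.5]
malicious_model = [1.5, -2.0, 0.0]   # hypothetical biased model after fine-tuning
weights = [0.25, 0.25, 0.5]          # the attacker is client 0

submitted = replacement_update(global_model, malicious_model, est_weight=0.25)
benign = [global_model, global_model]  # idealized: benign clients stay at the global model
new_global = aggregate([submitted] + benign, weights)
print(new_global)
```

Under these idealized assumptions, the aggregate coincides with the malicious model when the weight estimate is exact. An overestimated weight undershoots the target and an underestimated one overshoots it, which is one way to see why weight estimation is listed as a separate component of the attack.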

PFAttack: Stealthy Attack Bypassing Group Fairness in Federated Learning
