Attacks on fairness in Federated Learning

Joseph Rance, Filip Svoboda
2024-07-26
Abstract: Federated Learning is an important emerging distributed training paradigm that keeps data private on clients. It is now well understood that by controlling only a small subset of FL clients, it is possible to introduce a backdoor to a federated learning model, in the presence of certain attributes. In this paper, we present a new type of attack that compromises the fairness of the trained model. Fairness is understood to be the attribute-level performance distribution of a trained model. It is particularly salient in domains where, for example, skewed accuracy discrimination between subpopulations could have disastrous consequences. We find that by employing a threat model similar to that of a backdoor attack, an attacker is able to influence the aggregated model to have an unfair performance distribution between any given set of attributes. Furthermore, we find that this attack is possible by controlling only a single client. While combating naturally induced unfairness in FL has previously been discussed in depth, its artificially induced kind has been neglected. We show that defending against attacks on fairness should be a critical consideration in any situation where unfairness in a trained model could benefit a user who participated in its training.
Machine Learning, Cryptography and Security
What problem does this paper attempt to address?
This paper addresses attacks on model fairness in Federated Learning (FL). It presents a new type of attack that causes the trained model to exhibit an unfair performance distribution across attributes. Unlike a traditional backdoor attack, the attack does not aim to add new functionality to the model. Instead, under a threat model similar to that of a backdoor attack, the attacker controls a small subset of clients (as few as one, according to the abstract) and shifts the aggregated model so that it performs better on data with some attributes than on data with others. The resulting accuracy gap between subpopulations can have severe consequences in real-world applications, particularly in domains where fairness is critical. The authors validate the effectiveness of the attack through experiments and discuss how existing FL backdoor defense mechanisms might adapt to this new type of attack. The paper also stresses that defending against such attacks matters in any situation where unfairness in the trained model could benefit a participant in its training.
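To make "attribute-level performance distribution" concrete, the following is a minimal sketch of one way to measure it: compute accuracy separately for each attribute value and report the spread. The function names, the toy labels, and the max-minus-min gap are illustrative assumptions, not the metric or code used in the paper.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for each attribute value in `groups`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(per_group):
    """Largest minus smallest per-group accuracy (0.0 means equal accuracy)."""
    return max(per_group.values()) - min(per_group.values())


# Toy data (hypothetical): group "a" is always classified correctly,
# group "b" is classified correctly only once out of four samples.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
print(per_group)                # {'a': 1.0, 'b': 0.25}
print(accuracy_gap(per_group))  # 0.75
```

Under this simplified view, a model with high overall accuracy can still show a large gap between groups, which is the kind of skew the attack described above would aim to induce.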