CareFL: Contribution Guided Byzantine-Robust Federated Learning
Qihao Dong, Shengyuan Yang, Zhiyang Dai, Yansong Gao, Shang Wang, Yuan Cao, Anmin Fu, Willy Susilo
DOI: https://doi.org/10.1109/tifs.2024.3477912
IF: 7.231
2024-01-01
IEEE Transactions on Information Forensics and Security
Abstract: Byzantine-robust federated learning (FL) aims to let service providers obtain an accurate global model even in the presence of potentially malicious FL clients. Although considerable progress has been made on robust aggregation algorithms for FL in recent years, their efficacy is confined to particular forms of Byzantine attacks, and they remain vulnerable when confronted with a spectrum of attack vectors. Notably, a prevailing issue is that these algorithms rely heavily on examining local model gradients. An attacker can manipulate a small, carefully chosen portion of a model's gradients, out of potentially millions, which facilitates adaptive attacks. Drawing inspiration from the foundational Shapley value methodology in game theory, we introduce an effective FL scheme named CareFL, designed to provide robustness against a spectrum of state-of-the-art Byzantine attacks. Unlike approaches that rely on examining gradients, CareFL employs a universal metric, the loss of the local model, which is independent of specific gradients, to identify potentially malicious clients. Specifically, in each aggregation round, the FL server trains a reference model using a small auxiliary dataset (the auxiliary dataset can be removed at the cost of a slight degradation in defense). The server then employs the Shapley value to assess the contribution of each client-submitted model to minimizing the global model loss. Subsequently, the server selects the client models that are closer to the reference model in terms of Shapley values for the global model update. To reduce the computational overhead of CareFL when the number of clients is relatively large, we construct a variant, CareFL+, which generally works by grouping clients.
Extensive experiments on the well-established MNIST and CIFAR-10 datasets, covering diverse model architectures including AlexNet, demonstrate that CareFL consistently achieves accuracy comparable to that attained under attack-free conditions when faced with five formidable attacks. CareFL and CareFL+ outperform six existing state-of-the-art Byzantine-robust FL aggregation methods, including FLTrust, across both IID and non-IID data distribution settings.
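The Shapley-based selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; several details are assumptions made only for illustration: models are plain weight vectors, a coalition's value is the negative loss of the coordinate-wise average of its members, the empty coalition is valued at the negative loss of a baseline model, the reference model is treated as one more player, and the helper names (`average`, `shapley_values`, `select_clients`, `squared_error`) are hypothetical. Exact Shapley values cost exponential time in the number of players, which is consistent with the abstract's motivation for grouping clients in CareFL+.

```python
from itertools import combinations
from math import factorial


def average(models):
    """Coordinate-wise mean of equal-length weight vectors."""
    return [sum(ws) / len(models) for ws in zip(*models)]


def shapley_values(models, loss_fn, baseline_loss):
    """Exact Shapley value of each model for v(S) = -loss(average(S)).

    Assumption of this sketch: v(empty set) = -baseline_loss.
    """
    n = len(models)

    def value(subset):
        if not subset:
            return -baseline_loss
        return -loss_fn(average([models[i] for i in subset]))

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


def select_clients(client_models, reference_model, loss_fn, baseline_loss, keep):
    """Keep the `keep` clients whose Shapley value is closest to the reference's."""
    players = client_models + [reference_model]  # reference is the last player
    phi = shapley_values(players, loss_fn, baseline_loss)
    phi_ref = phi[-1]
    ranked = sorted(range(len(client_models)), key=lambda i: abs(phi[i] - phi_ref))
    return sorted(ranked[:keep]), phi


def squared_error(w, target=(1.0, 1.0)):
    """Toy stand-in for the loss measured by the server."""
    return sum((a - b) ** 2 for a, b in zip(w, target))


# Two honest clients near the optimum and one poisoned model (toy data).
clients = [[0.9, 1.1], [1.1, 0.9], [-5.0, -5.0]]
reference = [1.0, 1.0]
selected, phi = select_clients(
    clients, reference, squared_error, squared_error([0.0, 0.0]), keep=2
)
print(selected)  # [0, 1]
```

In this toy setting the poisoned model receives a strongly negative Shapley value, while the honest clients score close to the reference model, so only the honest clients are kept for aggregation.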