Global prototype distillation for heterogeneous federated learning

Shu Wu, Jindou Chen, Xueli Nie, Yong Wang, Xiancun Zhou, Linlin Lu, Wei Peng, Yao Nie, Waseef Menhaj
DOI: https://doi.org/10.1038/s41598-024-62908-0
IF: 4.6
2024-05-29
Scientific Reports
Abstract: Federated learning is a distributed machine learning paradigm in which the goal is to collaboratively train a high-quality global model while private training data remains local on distributed clients. However, heterogeneous data distribution across clients is a severe challenge for federated learning systems and substantially degrades model quality. To address this challenge, we propose global prototype distillation (FedGPD) for heterogeneous federated learning to improve the performance of the global model. The intuition is to use global class prototypes as knowledge to instruct local training on the client side. Eventually, local objectives will be consistent with the global optimum, so that FedGPD learns an improved global model. Experiments show that FedGPD outperforms previous state-of-the-art methods by 0.22% to 1.28% in terms of average accuracy on representative benchmark datasets.
multidisciplinary sciences
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

The paper addresses data heterogeneity in Federated Learning (FL). In FL, multiple clients train models locally and send their model parameters to a central server, which aggregates them into a global model. Because data distributions typically differ across clients (non-IID data), this heterogeneity severely impairs the quality of the global model. Specifically, imbalanced data biases local training, so each client's local objective deviates from the global optimum, significantly reducing the performance of the federated learning system.

To tackle this challenge, the authors propose Global Prototype Distillation (FedGPD). The method uses global class prototypes as knowledge to guide local training on each client, aligning local objectives with the global objective and thereby improving the global model. Experimental results show that FedGPD improves average accuracy by 0.22% to 1.28% over existing state-of-the-art methods on multiple benchmark datasets.
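The idea of guiding local training with global class prototypes can be illustrated with a minimal Python sketch. The summary above does not give the paper's actual loss or aggregation rule, so everything here is an assumption made for illustration: a prototype is taken to be the per-class mean of feature vectors, the server is assumed to combine client prototypes with a sample-count-weighted average, and a plain squared distance stands in for the distillation term. The function names (`local_prototypes`, `aggregate_prototypes`, `prototype_penalty`) are hypothetical and are not taken from FedGPD.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def local_prototypes(
    features: Sequence[Sequence[float]], labels: Sequence[int]
) -> Dict[int, Tuple[Vector, int]]:
    """Client side: mean feature vector per class, plus the sample count.

    Assumption: a "class prototype" is the per-class mean of features.
    """
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = defaultdict(int)
    for f, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(f)
        sums[y] = [s + v for s, v in zip(sums[y], f)]
        counts[y] += 1
    return {y: ([s / counts[y] for s in sums[y]], counts[y]) for y in sums}


def aggregate_prototypes(
    client_protos: Sequence[Dict[int, Tuple[Vector, int]]]
) -> Dict[int, Vector]:
    """Server side: count-weighted average of client prototypes per class.

    Assumption: weighting by sample count; the paper may aggregate differently.
    """
    sums: Dict[int, Vector] = {}
    totals: Dict[int, int] = defaultdict(int)
    for protos in client_protos:
        for y, (p, n) in protos.items():
            if y not in sums:
                sums[y] = [0.0] * len(p)
            sums[y] = [s + n * v for s, v in zip(sums[y], p)]
            totals[y] += n
    return {y: [s / totals[y] for s in sums[y]] for y in sums}


def prototype_penalty(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    global_protos: Dict[int, Vector],
) -> float:
    """Mean squared distance from each feature to its class's global prototype.

    Illustrative stand-in for a distillation term, not the paper's loss.
    Samples whose class has no global prototype are skipped.
    """
    total, used = 0.0, 0
    for f, y in zip(features, labels):
        if y in global_protos:
            total += sum((a - b) ** 2 for a, b in zip(f, global_protos[y]))
            used += 1
    return total / used if used else 0.0


# Toy non-IID setup: client A only holds class 0, client B holds classes 0 and 1.
feats_a, labels_a = [[0.0, 0.0], [2.0, 2.0]], [0, 0]
feats_b, labels_b = [[4.0, 4.0], [10.0, 0.0]], [0, 1]

global_protos = aggregate_prototypes(
    [local_prototypes(feats_a, labels_a), local_prototypes(feats_b, labels_b)]
)
penalty_a = prototype_penalty(feats_a, labels_a, global_protos)
print(global_protos)  # {0: [2.0, 2.0], 1: [10.0, 0.0]}
print(penalty_a)      # 4.0
```

In a real training loop, a term like `prototype_penalty` would presumably be added to each client's usual classification loss with some weighting factor, so that local features are pulled toward a shared, globally agreed representation of each class. That weighting and the exact form of the term are not stated in the summary above.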