FedDKD: Federated Learning with Decentralized Knowledge Distillation

Li Xinjia,Chen Boyu,Lu Wenlian
DOI: https://doi.org/10.1007/s10489-022-04431-1
IF: 5.3
2023-01-01
Applied Intelligence
Abstract:The heterogeneity of the data distribution generally influences federated learning performance in neural networks. For a well-performing global model, taking a weighted average of the local models, as in most existing federated learning algorithms, may not guarantee consistency with local models in the space of neural network maps. In this paper, we highlight the significance of the space of neural network maps to relieve the performance decay produced by data heterogeneity and propose a novel federated learning framework equipped with the decentralized knowledge distillation process (FedDKD). In FedDKD, we introduce a decentralized knowledge distillation (DKD) module to distill the knowledge of local models to teach the global model approaching the neural network map average by optimizing the divergence defined in the loss function, other than only averaging parameters as in the literature. Numerical experiments on various heterogeneous datasets reveal that FedDKD outperforms the state-of-the-art methods, especially on some extremely heterogeneous datasets.
What problem does this paper attempt to address?