One Teacher is Enough: A Server-Clueless Federated Learning with Knowledge Distillation

Wanyi Ning, Qi, Jingyu Wang, Mengde Zhu, Shaolong Li, Guang Yang, Jianxin Liao
DOI: https://doi.org/10.1109/tsc.2024.3414372
2024-01-01
Abstract: Machine learning-based services offer intelligent solutions with powerful models. To enhance model robustness, Federated Learning (FL) has emerged as a promising collaborative learning paradigm, which iteratively trains a global model through parameter exchange among multiple clients based on their local data. Generally, the local data are heterogeneous, which slows down convergence. Knowledge distillation is an effective technique against data heterogeneity, but existing works distill the ensemble knowledge from local models, ignoring the natural global knowledge in the aggregated model. This places limitations on their algorithms, such as the need for proxy data or the necessary exposure of local models to the server, which is prohibited in most privacy-preserving FL with a clueless server. In this work, we propose FedDGT, a novel knowledge distillation method for industrial server-clueless FL. FedDGT treats the aggregated model as the sole teacher, imparting its global knowledge to a generator, and then regularizes the drifted local models through the generator, overcoming previous limitations and providing better privacy and scalability support. Extensive experiments demonstrate that FedDGT can achieve highly competitive model performance while greatly reducing the number of communication rounds in a server-clueless scenario.
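The abstract does not spell out FedDGT's objective, so the following is not the authors' method. It is only a minimal sketch of two generic building blocks the abstract refers to: aggregating client parameters into a global model, and a distillation term in which that aggregated model acts as the single teacher. The function names (`aggregate`, `distillation_loss`), the size-weighted averaging rule, the temperature value, and the use of a KL divergence are all assumptions made for illustration; the generator described in the abstract is omitted entirely.

```python
import math

def aggregate(client_weights, client_sizes):
    """Size-weighted average of client parameter vectors (assumed FedAvg-style rule)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs.

    Here the teacher stands in for the aggregated global model and the
    student for a drifted local model (an illustrative assumption).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example: two clients holding 1 and 3 samples respectively.
global_params = aggregate([[1.0, 2.0], [3.0, 4.0]], [1, 3])
penalty = distillation_loss([2.0, 0.0], [0.0, 2.0])
```

In a scheme of this general kind, a local client would add such a penalty to its own training loss so that its predictions stay close to those of the global teacher; how FedDGT routes this knowledge through a generator is described in the paper itself.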