FedTweet: Two-fold Knowledge Distillation for non-IID Federated Learning

Yanhan Wang, Wenting Wang, Xin Wang, Heng Zhang, Xiaoming Wu, Ming Yang
DOI: https://doi.org/10.1016/j.compeleceng.2023.109067
IF: 4.152
2024-01-26
Computers & Electrical Engineering
Abstract: Federated Learning (FL) is a distributed learning approach that allows each client to retain its original data locally and share only the parameters of its local updates with the server. While FL can mitigate the problem of "data islands", training on non-independent and identically distributed (non-IID) data still suffers from model performance degradation caused by "client drift" in practical applications. To address this challenge, we propose "Two-fold Knowledge Distillation for non-IID Federated Learning" (FedTweet), an approach designed for the personalized training of both local and global models across various heterogeneous data settings. Specifically, the server uses global pseudo-data to fine-tune the initial aggregated model through knowledge distillation, and it adopts dynamic aggregation weights for the local generators, based on model similarity, to ensure diversity in the global pseudo-data. Each client freezes the received global model as a teacher model and conducts adversarial training between its local model and local generator, thus preserving the personalized information in the local updates while correcting their directions. FedTweet thereby lets the global and local models serve as teacher models for each other, providing bidirectional guarantees for personalization and generalization. Finally, extensive experiments on benchmark datasets demonstrate that FedTweet outperforms several previous FL methods on heterogeneous datasets.
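The two server-side ingredients named in the abstract, knowledge distillation and similarity-based aggregation weights, can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption rather than the authors' method: the loss is the standard temperature-softened KL divergence, similarity is cosine similarity over flattened parameter vectors, and weights are taken to grow with similarity (the paper's actual rule may differ, for example by favoring dissimilar generators). All function names are hypothetical.

```python
import math

def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Assumed loss: KL(teacher || student) on temperature-softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # Clamp q to avoid log(0) if a probability underflows.
    kl = sum(pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

def cosine_similarity(a, b):
    """Cosine similarity between two flattened parameter vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_weights(global_params, client_params_list):
    """Assumed rule: softmax of each client's similarity to the global model."""
    sims = [cosine_similarity(global_params, c) for c in client_params_list]
    return softmax(sims)

# Toy usage: two clients, one aligned with the global model, one orthogonal.
weights = similarity_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
loss = distillation_loss([0.5, 1.0, 2.0], [1.0, 2.0, 3.0])
```

In this sketch the weights always sum to one and the loss is zero when student and teacher logits coincide, which makes it easy to sanity-check before swapping in real model outputs.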
engineering, electrical & electronic; computer science, interdisciplinary applications; computer science, hardware & architecture