Abstract: In practical settings, enabling robust Federated Learning (FL) systems, in terms of both generalization and personalization, is an important research question. This is challenging because of the non-i.i.d. properties of clients' data, often referred to as statistical heterogeneity, and the small local data samples drawn from various data distributions. Therefore, to develop robust generalized global and personalized models, conventional FL methods need to redesign knowledge aggregation from biased local models while accounting for the large divergence of learning parameters caused by skewed client data. In this work, we demonstrate that a knowledge transfer mechanism achieves these objectives, and we develop a novel knowledge distillation-based approach to study the extent of knowledge transfer between the global model and local models. Our method considers the suitability of transferring the outcome distribution and/or the embedding vector of representation from trained models during cross-device knowledge transfer using a small proxy dataset in heterogeneous FL. In doing so, we alternately perform cross-device knowledge transfer following two general formulations: 1) global knowledge transfer and 2) on-device knowledge transfer. Through simulations on three federated datasets, we show that the proposed method achieves significant speedups and high personalized performance of local models. Furthermore, the proposed approach offers a more stable algorithm than other baselines during training, with minimal communication data load when exchanging the trained model's outcomes and representation.

What problem does this paper attempt to address?

This paper primarily focuses on how to achieve robust generalization and personalization capabilities in Federated Learning (FL) systems. Specifically, the paper attempts to address the following key issues:

1. **Non-Independent and Identically Distributed (non-i.i.d.) Data Problem**:
   - Client data usually exhibits statistical heterogeneity, leading to poor model training performance.
2. **Small Sample Local Dataset Problem**:
   - The data distribution differs across clients, and the sample size is small, which can result in biased models.
3. **Model Parameter Aggregation Problem**:
   - Traditional federated learning methods rely on coordinate-based averaging for model parameter aggregation. This approach can lead to significant differences in model parameters under highly biased data, thereby slowing down the learning process.

To address these issues, the paper proposes a new Cross-Device Knowledge Transfer (CDKT) mechanism. This method aims to achieve its goals through the following two mechanisms:

1. **Global Model Construction**:
   - Transfer the knowledge from client models to the global model to improve the generalization capability of the global model.
2. **Local Device Learning**:
   - Transfer the knowledge from the global model to client models to enhance the personalization performance of client models.

Specifically, the CDKT-FL method utilizes a proxy dataset for knowledge transfer. The proxy dataset is a small-scale dataset available to clients, used to convey the output distribution or embedded representation features of the model. This approach not only improves model performance but also reduces communication load and enhances user data privacy protection.

Experimental results show that this method demonstrates significant speed improvements and personalization performance enhancements on multiple benchmark datasets.
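The proxy-dataset knowledge transfer summarized above can be illustrated with a minimal sketch. It is not the paper's exact objective: the function names, the KL direction (teacher to student), the squared-L2 embedding distance, and the weight `beta` are all assumptions made for illustration. The idea it shows is that a "student" model (global or local) is penalized for disagreeing with a "teacher" model on the shared proxy samples, either in its output distribution, its embedding vector, or both.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def squared_l2(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def transfer_loss(teacher_logits, student_logits,
                  teacher_emb=None, student_emb=None, beta=0.5):
    """Hypothetical knowledge-transfer regularizer on a batch of proxy samples.

    Each argument is a list with one entry per proxy sample. The loss is the
    mean KL divergence between teacher and student output distributions,
    plus (optionally) beta times the squared distance between embeddings.
    """
    n = len(teacher_logits)
    loss = 0.0
    for i in range(n):
        # Output-distribution term: match the student's prediction to the teacher's.
        loss += kl_divergence(softmax(teacher_logits[i]), softmax(student_logits[i]))
        # Optional representation term: match the embedding vectors.
        if teacher_emb is not None and student_emb is not None:
            loss += beta * squared_l2(teacher_emb[i], student_emb[i])
    return loss / n


# Example: the global model as teacher, a client model as student,
# evaluated on two proxy samples with three classes each.
global_logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]]
client_logits = [[1.0, 1.0, 0.2], [0.1, 2.0, 0.4]]
print(transfer_loss(global_logits, client_logits))
```

In this sketch only logits and embeddings computed on the proxy samples would need to be exchanged rather than full model parameters, which is consistent with the reduced communication load the summary mentions. In a training loop, such a term would typically be added to the usual supervised loss on the device's own data.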

CDKT-FL: Cross-Device Knowledge Transfer using Proxy Dataset in Federated Learning

MCKD: Mutually Collaborative Knowledge Distillation for Federated Domain Adaptation and Generalization

pFedKT: Personalized federated learning with dual knowledge transfer

A Prototype-Based Knowledge Distillation Framework for Heterogeneous Federated Learning

A Hierarchical Knowledge Transfer Framework for Heterogeneous Federated Learning

Data-Free Knowledge Distillation for Heterogeneous Federated Learning

A Personalized Federated Learning Method Based on Clustering and Knowledge Distillation

Personalized and privacy-enhanced federated learning framework via knowledge distillation

FedD2S: Personalized Data-Free Federated Knowledge Distillation

Handling Data Heterogeneity in Federated Learning via Knowledge Distillation and Fusion

Digital Twin-Assisted Knowledge Distillation Framework for Heterogeneous Federated Learning

The Best of Both Worlds: Accurate Global and Personalized Models through Federated Learning with Data-Free Hyper-Knowledge Distillation

Fine-tuning Global Model Via Data-Free Knowledge Distillation for Non-IID Federated Learning

Parameterized Knowledge Transfer for Personalized Federated Learning

FedSiKD: Clients Similarity and Knowledge Distillation: Addressing Non-i.i.d. and Constraints in Federated Learning

Tailored Federated Learning: Leveraging Direction Regulation & Knowledge Distillation

Personalized Federated Learning with Adaptive Feature Aggregation and Knowledge Transfer

FedZKT: Zero-Shot Knowledge Transfer towards Resource-Constrained Federated Learning with Heterogeneous On-Device Models

Federated Split Learning Via Mutual Knowledge Distillation

Decentralized Federated Learning through Proxy Model Sharing