Abstract: Federated learning (FL) supports privacy-preserving services by training models without collecting raw user data. Most FL algorithms assume that all data are annotated, which is impractical given the high cost of labeling in real applications. To reduce the reliance on labeled data, semi-supervised federated learning (SSFL) has been proposed to exploit unlabeled data on clients to improve model performance. However, most existing methods either raise privacy concerns by sharing models trained on other clients, or generate pseudo-labels for unlabeled local datasets with the global model, which is usually biased towards the global data distribution. The latter can yield inaccurate pseudo-labels because of the gap between the local data distribution and the distribution captured by the global model, especially in non-IID settings. In this paper, we propose a semi-supervised heterogeneous federated learning method with local knowledge enhancement, called FedLoKe, which aims to train an accurate global model from both labeled and unlabeled local data with non-IID distributions. Specifically, in FedLoKe, the server maintains a global model to capture the global data distribution, and each client learns a local model to capture its local data distribution. Since the distribution captured by the local model is aligned with the local data distribution, we use the local model to generate high-accuracy pseudo-labels for the unlabeled dataset, which are then used to train the global model. To prevent the local model from severely overfitting the local labeled data, we further use an exponential moving average and apply the global model to generate pseudo-labels for local model training. Experiments on four datasets show the effectiveness of FedLoKe. Our code is available at: https://github.com/zcfinal/FedLoKe.
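
The abstract names two generic building blocks: pseudo-labeling of unlabeled samples and an exponential moving average (EMA). The following is a minimal sketch of both, not the FedLoKe implementation. The function names, the confidence threshold of 0.9, the EMA decay of 0.99, and the choice to average model parameters are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(logits_batch, threshold=0.9):
    """Keep only confident predictions as pseudo-labels.

    Returns (sample_index, predicted_class) pairs for samples whose top
    class probability is at least `threshold` (a hypothetical value).
    """
    selected = []
    for i, logits in enumerate(logits_batch):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((i, label))
    return selected


def ema_update(ema_params, new_params, decay=0.99):
    """Blend new parameters into a running average.

    Averaging parameters is an assumption; the abstract only states
    that an exponential moving average is used.
    """
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, new_params)]
```

In this sketch, the logits passed to `pseudo_label` would come from the local model when producing training targets for the global model, and from the global model when producing targets for the local model, mirroring the two directions described in the abstract.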

Non-IID always Bad? Semi-Supervised Heterogeneous Federated Learning with Local Knowledge Enhancement
