Abstract: Federated learning (FL) is important for privacy-preserving services because it trains models without collecting raw user data. Most FL algorithms assume that all data is annotated, which is impractical given the high cost of labeling data in real applications. To reduce the reliance on labeled data, semi-supervised federated learning (SSFL) has been proposed to exploit unlabeled data on clients to improve model performance. However, most existing methods either raise privacy concerns by sharing models trained on other clients, or generate pseudo-labels for unlabeled local datasets with the global model, which is usually biased towards the global data distribution. The latter can yield sub-optimal pseudo-label accuracy because of the gap between the local data distribution and the distribution captured by the global model, especially in non-IID settings. In this paper, we propose a semi-supervised heterogeneous federated learning method with local knowledge enhancement, called FedLoKe, which aims to train an accurate global model from both labeled and unlabeled local data with non-IID distributions. Specifically, in FedLoKe, the server maintains a global model to capture the global data distribution, and each client learns a local model to capture its local data distribution. Since the distribution captured by the local model is aligned with the local data distribution, we use the local model to generate high-accuracy pseudo-labels for the unlabeled dataset, which are then used for global model training. To prevent the local model from severely overfitting the local labeled data, we further use an exponential moving average and apply the global model to generate pseudo-labels for local model training. Experiments on four datasets show the effectiveness of FedLoKe. Our code is available at: https://github.com/zcfinal/FedLoKe.
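
The two building blocks named in the abstract, pseudo-labeling and the exponential moving average, can be illustrated with a minimal Python sketch. This is not the FedLoKe implementation (see the linked repository for that). The function names, the confidence threshold, and the choice to average model parameters are assumptions made for illustration; the abstract does not specify how pseudo-labels are filtered or which quantity the moving average is applied to.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(logits, threshold=0.9):
    """Return the predicted class for one unlabeled sample, or None.

    Assumption: a prediction is kept only if its probability reaches
    `threshold`. The abstract does not state a filtering rule.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None


def ema_update(ema_params, new_params, decay=0.99):
    """Blend new parameters into a running exponential moving average.

    Assumption: the average is taken over model parameters, represented
    here as flat lists of floats.
    """
    return [decay * e + (1.0 - decay) * p
            for e, p in zip(ema_params, new_params)]


# Hypothetical logits produced by a client's local model.
confident = pseudo_label([4.0, 0.5, 0.2])   # kept as a pseudo-label
uncertain = pseudo_label([1.0, 0.9, 0.8])   # rejected: too uncertain
smoothed = ema_update([1.0, 2.0], [0.0, 4.0], decay=0.5)
```

In this sketch the same `pseudo_label` helper could serve both directions described in the abstract: applied to local-model outputs when labeling data for global model training, and to global-model outputs when labeling data for local model training.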

FedRS: Federated Learning with Restricted Softmax for Label Distribution Non-IID Data

Federated Learning with Label Distribution Skew via Logits Calibration

Optimizing Federated Learning on Non-IID Data Using Local Shapley Value

FedDistill: Global Model Distillation for Local Model De-Biasing in Non-IID Federated Learning

Privacy-Preserving Federated Learning Against Label-Flipping Attacks on Non-IID Data

Non-IID always Bad? Semi-Supervised Heterogeneous Federated Learning with Local Knowledge Enhancement

Federated Learning with Label-Masking Distillation

Federated Learning with Extreme Label Skew: A Data Extension Approach

Federated Learning with Soft Clustering

Personalized federated learning based on feature fusion

One-Shot Federated Learning with Label Differential Privacy

Stabilizing and Improving Federated Learning with Non-IID Data and Client Dropout

DistFL: Distribution-aware Federated Learning for Mobile Scenarios

FedBN: Federated Learning on Non-IID Features via Local Batch Normalization

One-shot Federated Learning via Synthetic Distiller-Distillate Communication

FedSLD: Federated Learning with Shared Label Distribution for Medical Image Classification

FedBnR: Mitigating federated learning Non-IID problem by breaking the skewed task and reconstructing representation

FedWon: Triumphing Multi-domain Federated Learning Without Normalization

Dataset Distillation-based Hybrid Federated Learning on Non-IID Data

Federated Learning with Instance-Dependent Noisy Label

Rethinking Client Drift in Federated Learning: A Logit Perspective