Abstract: Federated learning (FL) is an emerging distributed machine learning paradigm that enables collaborative training of machine learning models over decentralized devices without exposing their local data. One of the major challenges in FL is the uneven distribution of data across client devices, which violates the assumption of independent and identically distributed (IID) training samples made in conventional machine learning. To address the performance degradation caused by such data heterogeneity, clustered federated learning (CFL) shows promise by grouping clients into separate learning clusters based on the similarity of their local data distributions. However, state-of-the-art CFL approaches require a large number of communication rounds to learn the distribution similarities during training before the cluster formation stabilizes. Moreover, some of these algorithms rely heavily on a predefined number of clusters, which limits their flexibility and adaptability. In this paper, we propose {\em FedClust}, a novel approach for CFL that leverages the correlation between local model weights and the data distribution of clients. {\em FedClust} groups clients into clusters in a one-shot manner by measuring the similarity among clients based on strategically selected partial weights of locally trained models. We conduct extensive experiments on four benchmark datasets with different non-IID data settings. Experimental results demonstrate that {\em FedClust} improves model accuracy by up to $\sim$45\% and converges faster, while reducing communication cost by up to 2.7$\times$ compared to its state-of-the-art counterparts.
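
The one-shot, weight-driven grouping described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which partial weights are selected, which similarity measure is used, or which clustering algorithm is applied, so everything below is an assumption made for illustration: the Euclidean distance, the single-linkage grouping with a distance threshold, the toy weight vectors, and the names `pairwise_distances` and `cluster_clients` are all hypothetical and are not taken from the paper.

```python
import math


def pairwise_distances(weights):
    """Euclidean distance between every pair of clients' partial weight vectors.

    Assumption: Euclidean distance stands in for the similarity measure,
    which the abstract does not name.
    """
    n = len(weights)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(weights[i], weights[j])
            dist[i][j] = dist[j][i] = d
    return dist


def cluster_clients(weights, threshold):
    """Group clients whose partial weights lie closer than `threshold`.

    Assumption: single-linkage grouping via union-find. The number of
    clusters is not fixed in advance; it follows from the threshold.
    """
    n = len(weights)
    dist = pairwise_distances(weights)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy partial weights (hypothetical): clients 0 and 1 resemble each other,
# as do clients 2 and 3.
partial_weights = [
    [0.9, 0.1, 0.0],
    [1.0, 0.0, 0.1],
    [0.0, 0.9, 1.0],
    [0.1, 1.0, 0.9],
]
clusters = cluster_clients(partial_weights, threshold=0.5)
print(clusters)
```

In this sketch each client would upload its selected weights once, and the server would compute all pairwise distances and form the groups in a single pass, which is one way to read "one-shot". The threshold is a tunable value chosen here only so that the toy data separates into two groups.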

Clients Help Clients: Alternating Collaboration for Semi-Supervised Federated Learning

Efficient Semi-Supervised Federated Learning for Heterogeneous Participants

Rethinking Semi-Supervised Federated Learning: How to co-train fully-labeled and fully-unlabeled client imaging data

Federated learning based on stratified sampling and regularization

SemiSFL: Split Federated Learning on Unlabeled and Non-IID Data

An Aggregation-Free Federated Learning for Tackling Data Heterogeneity

Enhancing Federated Learning with In-Cloud Unlabeled Data

FedSSA: Semantic Similarity-based Aggregation for Efficient Model-Heterogeneous Personalized Federated Learning

FedAR: Addressing Client Unavailability in Federated Learning with Local Update Approximation and Rectification

Enhancing Federated Learning with Server-Side Unlabeled Data by Adaptive Client and Data Selection

Towards Client Driven Federated Learning

Adaptive Clustered Federated Learning with Representation Similarity

Enhancing Edge-Assisted Federated Learning with Asynchronous Aggregation and Cluster Pairing

Federated Learning from Only Unlabeled Data with Class-Conditional-Sharing Clients

(FL)$^2$: Overcoming Few Labels in Federated Semi-Supervised Learning

FedSC: A federated learning algorithm based on client-side clustering

Towards Instance-adaptive Inference for Federated Learning

FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering

FedClust: Optimizing Federated Learning on Non-IID Data through Weight-Driven Client Clustering

Federated Learning under Heterogeneous and Correlated Client Availability

Federated Learning with Additional Mechanisms on Clients to Reduce Communication Costs