Abstract: Federated learning (FL) has recently gained considerable attention in edge computing and the Internet of Things because it enables distributed clients to cooperatively train models while keeping raw data local. However, existing works usually suffer from limited communication resources, dynamic network conditions, and heterogeneous client properties, all of which hinder efficient FL. To tackle these challenges simultaneously, we propose a heterogeneity-aware FL framework, called FedCG, with adaptive client selection and gradient compression. Specifically, FedCG introduces diversity into client selection and aims to select a representative client subset that accounts for statistical heterogeneity. The selected clients are assigned different compression ratios based on their heterogeneous and time-varying capabilities. After local training, they upload sparse model updates matched to their capabilities for global aggregation, which effectively reduces the communication cost and mitigates the straggler effect. More importantly, instead of naively combining client selection and gradient compression, we show that the two decisions are tightly coupled and therefore require joint optimization. We theoretically analyze the impact of both client selection and gradient compression on convergence performance. Guided by the convergence rate, we develop an iteration-based algorithm that jointly optimizes client selection and the compression ratio decision using submodular maximization and linear programming. On this basis, we propose a quantized extension of FedCG, termed Q-FedCG, which further adjusts quantization levels based on gradient innovation. Extensive experiments on both real-world prototypes and simulations show that FedCG and its extension can provide up to 6.4× speedup.
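To make the idea of capability-matched sparse updates concrete, the following is a minimal sketch, not the FedCG algorithm itself. It assumes top-k magnitude sparsification as the compression scheme and a plain coordinate-wise mean for aggregation; the abstract specifies neither. The function names, client names, updates, and ratios are all hypothetical.

```python
import heapq


def top_k_sparsify(update, ratio):
    """Keep roughly ratio * len(update) largest-magnitude entries; zero the rest.

    Top-k sparsification is an assumed compression scheme for illustration.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    k = max(1, round(ratio * len(update)))
    # Indices of the k entries with the largest absolute value.
    keep = set(heapq.nlargest(k, range(len(update)), key=lambda i: abs(update[i])))
    return [v if i in keep else 0.0 for i, v in enumerate(update)]


def aggregate(sparse_updates):
    """Average sparse updates coordinate-wise (a plain mean, assumed here)."""
    n = len(sparse_updates)
    return [sum(col) / n for col in zip(*sparse_updates)]


# Hypothetical model updates from two selected clients.
updates = {
    "fast": [0.5, -0.1, 0.3, 0.05],
    "slow": [-0.2, 0.4, 0.1, -0.6],
}
# The slower client is given a smaller ratio, so it uploads fewer entries.
ratios = {"fast": 0.75, "slow": 0.25}

sparse = {c: top_k_sparsify(u, ratios[c]) for c, u in updates.items()}
global_update = aggregate(list(sparse.values()))
```

In this sketch the "slow" client uploads a single nonzero entry while the "fast" client uploads three, which is the mechanism by which differing compression ratios could shorten the upload time of stragglers. How the ratios and the client subset are actually chosen is the joint optimization problem the abstract describes and is not modeled here.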

GGS: General Gradient Sparsification for Federated Learning in Edge Computing

Gradient-Congruity Guided Federated Sparse Training

Adaptive Gradient Sparsification for Efficient Federated Learning: An Online Learning Approach

Sparse Gradient Compression for Distributed SGD

Adaptive Batchsize Selection and Gradient Compression for Wireless Federated Learning

Accelerating Federated Learning with Adaptive Extra Local Updates Upon Edge Networks

Learnable Sparse Customization in Heterogeneous Edge Computing

Federated Learning With Client Selection and Gradient Compression in Heterogeneous Edge Systems

DEFT: Exploiting Gradient Norm Difference between Model Layers for Scalable Gradient Sparsification

FedGSNR: Accelerating Federated Learning on Non-IID Data via Maximum Gradient Signal to Noise Ratio

Stochastic gradient compression for federated learning over wireless network

CG-FedLLM: How to Compress Gradients in Federated Fune-tuning for Large Language Models

Recycling Model Updates in Federated Learning: Are Gradient Subspaces Low-Rank?

FedGKD: Towards Heterogeneous Federated Learning via Global Knowledge Distillation

Communication-Efficient Distributed Learning via Sparse and Adaptive Stochastic Gradient

Joint Dynamic Grouping and Gradient Coding for Time-Critical Distributed Machine Learning in Heterogeneous Edge Networks

Adaptive Federated Learning in Resource Constrained Edge Computing Systems

Efficient Asynchronous Vertical Federated Learning via Gradient Prediction and Double-End Sparse Compression

Federated Split Learning with Model Pruning and Gradient Quantization in Wireless Networks

Toward Efficient Federated Learning in Multi-Channeled Mobile Edge Network with Layerd Gradient Compression

Preserving Near-Optimal Gradient Sparsification Cost for Scalable Distributed Deep Learning