Abstract: Federated Learning (FL) trains models on data distributed across multiple Internet of Things (IoT) devices or servers, reducing data transfer overhead and keeping raw data on the device. However, the decentralization and diversity of IoT devices introduce statistical and system heterogeneity, which can lead to unstable model training and even system crashes. Although many studies attribute performance issues to client-drift caused by this heterogeneity, there is little insight into how different forms of heterogeneity affect local model gradient variations and model convergence. In this paper, we investigate the characteristics of model gradient distributions in heterogeneous training. We find that the challenge is not solely due to client-drift but is also closely linked to a high degree of model overfitting, which negatively affects local model training and equilibrium convergence. To address this challenge, we introduce an efficient framework called GOFL. First, GOFL incorporates the Federated Gradient Normalization (FGN) technique to maintain gradient distribution consistency while mitigating the client-drift that stems from heterogeneity. We also highlight the benefits of FGN in reducing local model overfitting and improving convergence. Second, GOFL introduces the Federated Device Aggregation (FDA) strategy, a critical complement to FGN. It adaptively guides device selection and aggregation based on device contributions, ensuring more balanced training in the face of system heterogeneity. Experimental results demonstrate that GOFL achieves state-of-the-art training accuracy while reducing the number of training rounds. In particular, it improves the accuracy of the classical FL framework FedAvg by 30.57% and reduces the number of rounds needed for convergence by a factor of 5.17.
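
The abstract names two ideas, normalizing client gradients and weighting aggregation by device contribution, without giving formulas. The sketch below illustrates the general pattern only; it is not the GOFL algorithm. It assumes that "normalization" means scaling each client update to unit L2 norm and that each device's contribution is supplied as a non-negative score. The function names (`normalize_update`, `aggregate`, `server_step`) and the `contributions` parameter are hypothetical.

```python
import math


def normalize_update(update, eps=1e-12):
    """Scale one client update to (approximately) unit L2 norm.

    Assumption: unit-norm scaling stands in for gradient normalization;
    the abstract does not say which normalization FGN uses.
    """
    norm = math.sqrt(sum(x * x for x in update))
    return [x / (norm + eps) for x in update]


def aggregate(updates, contributions):
    """Contribution-weighted average of normalized client updates.

    Assumption: `contributions` are non-negative scores, one per client;
    how such scores would be computed is not specified in the abstract.
    """
    total = sum(contributions)
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")
    weights = [c / total for c in contributions]
    normalized = [normalize_update(u) for u in updates]
    dim = len(normalized[0])
    return [
        sum(w * u[i] for w, u in zip(weights, normalized))
        for i in range(dim)
    ]


def server_step(global_model, updates, contributions, lr=1.0):
    """Apply one server-side descent step along the aggregated direction."""
    direction = aggregate(updates, contributions)
    return [g - lr * d for g, d in zip(global_model, direction)]


if __name__ == "__main__":
    # Two hypothetical clients whose raw updates differ greatly in magnitude.
    client_updates = [[3.0, 4.0], [0.0, 10.0]]
    scores = [1.0, 1.0]
    print(server_step([1.0, 1.0], client_updates, scores))
```

Because every update is rescaled before averaging, a client with an unusually large raw gradient cannot dominate the aggregated direction; its influence is set by its contribution score alone.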

Gradient Rotation Unit for Non-I.I.D. Federated Learning

UniGrad-FS: Unified Gradient Projection with Flatter Sharpness for Continual Learning

FedDGP: Disentangling Global and Personal Models for Federated Learning

Understanding the Training Dynamics in Federated Deep Learning via Aggregation Weight Optimization

DRAG: Divergence-based Adaptive Aggregation in Federated Learning on Non-IID Data

Generalized Federated Learning via Gradient Norm-Aware Minimization and Control Variables

FedGSNR: Accelerating Federated Learning on Non-IID Data via Maximum Gradient Signal to Noise Ratio

FedAgg: Adaptive Federated Learning with Aggregated Gradients

Federated Learning with Unbiased Gradient Aggregation and Controllable Meta Updating

Gradient Masked Averaging for Federated Learning

Byzantine-resilient Federated Learning Employing Normalized Gradients on Non-IID Datasets

DegaFL: Decentralized Gradient Aggregation for Cross-silo Federated Learning

GPFL: A Gradient Projection-Based Client Selection Framework for Efficient Federated Learning

Gradient-Congruity Guided Federated Sparse Training

Gradient Free Personalized Federated Learning

GOFL: an Accurate and Efficient Federated Learning Framework Based on Gradient Optimization in Heterogeneous IoT Systems

Accelerating Federated Learning by Selecting Beneficial Herd of Local Gradients

CG-FedLLM: How to Compress Gradients in Federated Fune-tuning for Large Language Models

Adaptive Gradient Sparsification for Efficient Federated Learning: An Online Learning Approach

GAIN: Enhancing Byzantine Robustness in Federated Learning with Gradient Decomposition