Abstract: Federated Learning (FL) marks a transformative approach to distributed model training by combining locally optimized models from various clients into a unified global model. While FL preserves data privacy by eliminating centralized storage, it encounters significant challenges such as performance degradation, slower convergence, and reduced robustness of the global model due to the heterogeneity in client data distributions. Among the various forms of data heterogeneity, label skew emerges as a particularly formidable and prevalent issue, especially in domains such as image classification. To address these challenges, we begin with comprehensive experiments to pinpoint the underlying issues in the FL training process. Based on our findings, we then introduce an innovative dual-strategy approach designed to effectively resolve these issues. First, we introduce an adaptive loss function for client-side training, meticulously crafted to preserve previously acquired knowledge while maintaining an optimal equilibrium between local optimization and global model coherence. Second, we develop a dynamic aggregation strategy for aggregating client models at the server. This approach adapts to each client's unique learning patterns, effectively addressing the challenges of diverse data across the network. Our comprehensive evaluation, conducted across three diverse real-world datasets, coupled with theoretical convergence guarantees, demonstrates the superior efficacy of our method compared to several established state-of-the-art approaches.

What problem does this paper attempt to address?

The paper addresses data heterogeneity in Federated Learning (FL), in particular the performance degradation, slow convergence, and reduced robustness of the global model caused by label skew. Specifically:

1. **Challenges of data heterogeneity**: In federated learning, data distributions differ widely across clients, and in tasks such as image classification label skew is particularly severe. This heterogeneity can drive training toward "sharp minima", which harms the generalization ability and stability of the model.
2. **Limitations of existing methods**: Traditional static aggregation methods (such as FedAvg) handle non-independent and identically distributed (non-IID) data poorly and adapt with difficulty to dynamically changing data distributions and client drift, so model performance declines.

To solve these problems, the paper proposes a dual-strategy method named FedDUAL, which includes:

- **Adaptive loss function**: An adaptive loss function is introduced during client training. By adjusting the trade-off between the local and global models, it supports local optimization while keeping the client consistent with the global model. The loss combines cross-entropy with Kullback-Leibler (KL) divergence, which quantifies the difference between the probability distributions of the local and global model weights.
- **Dynamic aggregation strategy**: On the server side, a dynamic aggregation method based on the Wasserstein barycenter is applied to the gradients of the final layer. It integrates the learning behaviors of different clients more effectively, reduces the negative impact of non-IID data, and thereby improves the stability and generalization ability of the model.

Through these two strategies, FedDUAL can develop more robust and general-purpose federated models in highly heterogeneous data environments, significantly improving model performance and convergence speed. Experimental results show that the method outperforms existing state-of-the-art methods on three real-world datasets.
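
As a rough illustration of the two ingredients named above (a cross-entropy plus KL client loss, and a Wasserstein-barycenter aggregate), the following Python sketch uses only the standard library. It is not the paper's implementation. The softmax normalisation of weight vectors, the fixed coefficient `lam`, and the names `client_loss` and `barycenter_1d` are assumptions made for this example; the summary does not say how FedDUAL adapts the trade-off or how the barycenter is mapped back to model parameters. The barycenter part relies on a standard fact: in one dimension, the Wasserstein-2 barycenter of empirical distributions with equally many points is obtained by averaging their sorted values.

```python
import math


def softmax(values):
    """Numerically stable softmax: maps a list of floats to a probability vector."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample given raw logits and the true class index."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two probability vectors of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def client_loss(logits, label, local_weights, global_weights, lam):
    """Cross-entropy plus lam * KL between softmax-normalised weight vectors.

    Assumption: weights are turned into distributions with a softmax, and lam
    is supplied by the caller. The summary does not specify either choice.
    """
    p_local = softmax(local_weights)
    p_global = softmax(global_weights)
    return cross_entropy(logits, label) + lam * kl_divergence(p_local, p_global)


def barycenter_1d(client_values, client_weights=None):
    """Wasserstein-2 barycenter of 1-D empirical distributions.

    Every client must contribute the same number of values. The result is the
    weighted average of the clients' sorted values (their quantiles).
    """
    n_clients = len(client_values)
    if client_weights is None:
        client_weights = [1.0 / n_clients] * n_clients
    sorted_values = [sorted(v) for v in client_values]
    length = len(sorted_values[0])
    if any(len(v) != length for v in sorted_values):
        raise ValueError("all clients must supply the same number of values")
    return [
        sum(w * v[i] for w, v in zip(client_weights, sorted_values))
        for i in range(length)
    ]


# Toy usage with made-up numbers (hypothetical, not from the paper).
same = client_loss([2.0, 0.5, -1.0], 0, [0.1, 0.2], [0.1, 0.2], lam=0.5)
drifted = client_loss([2.0, 0.5, -1.0], 0, [3.0, -3.0], [0.1, 0.2], lam=0.5)
print(drifted > same)  # True: the KL term penalises drift from the global weights
print(barycenter_1d([[3.0, 1.0, 2.0], [6.0, 4.0, 5.0]]))  # [2.5, 3.5, 4.5]
```

In this toy setup a larger `lam` pulls the client harder toward the global model, while `lam = 0` reduces the loss to plain cross-entropy. Treating a client's flattened final-layer gradient as a one-dimensional sample is only one possible reading of the aggregation step described above.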

FedDUAL: A Dual-Strategy with Adaptive Loss and Dynamic Aggregation for Mitigating Data Heterogeneity in Federated Learning

FedDGP: Disentangling Global and Personal Models for Federated Learning

Understanding the Training Dynamics in Federated Deep Learning via Aggregation Weight Optimization

An Aggregation-Free Federated Learning for Tackling Data Heterogeneity

FedSiam-DA: Dual-aggregated Federated Learning Via Siamese Network for Non-Iid Data

FedShift: Tackling Dual Heterogeneity Problem of Federated Learning via Weight Shift Aggregation

FedDA: Resource-adaptive Federated Learning with Dual-Alignment Aggregation Optimization for Heterogeneous Edge Devices

CDFed: Contribution-based Dynamic Federated Learning for Managing System and Statistical Heterogeneity

Adaptive Self-Distillation for Minimizing Client Drift in Heterogeneous Federated Learning

Federated learning on non-IID and long-tailed data via dual-decoupling

Dual-Criterion Model Aggregation in Federated Learning: Balancing Data Quantity and Quality

Enhancing Federated Learning Convergence with Dynamic Data Queue and Data Entropy-driven Participant Selection

Tackling Data Heterogeneity in Federated Learning via Loss Decomposition

Heterogeneity-Aware Federated Learning with Adaptive Client Selection and Gradient Compression

FedADMM: A Robust Federated Deep Learning Framework with Adaptivity to System Heterogeneity

Handling Data Heterogeneity in Federated Learning via Knowledge Distillation and Fusion

FedFed: Feature Distillation Against Data Heterogeneity in Federated Learning

A-FedPD: Aligning Dual-Drift is All Federated Primal-Dual Learning Needs

Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning

Gradient Masked Averaging for Federated Learning

Enhancing Edge-Assisted Federated Learning with Asynchronous Aggregation and Cluster Pairing