Abstract: Federated Learning (FL) models often experience client drift caused by heterogeneous data, where the distribution of data differs across clients. To address this issue, advanced research primarily focuses on manipulating the existing gradients to achieve more consistent client models. In this paper, we present an alternative perspective on client drift and aim to mitigate it by generating improved local models. First, we analyze the generalization contribution of local training and conclude that this generalization contribution is bounded by the conditional Wasserstein distance between the data distributions of different clients. Then, we propose FedImpro to construct similar conditional distributions for local training. Specifically, FedImpro decouples the model into high-level and low-level components and trains the high-level portion on reconstructed feature distributions. This approach enhances the generalization contribution and reduces the dissimilarity of gradients in FL. Experimental results show that FedImpro can help FL defend against data heterogeneity and enhance the generalization performance of the model.
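
The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar feature extractor `low_level`, the per-label Gaussian estimate, and the helper names `local_feature_stats`, `merge_stats`, and `sample_shared_features` are all assumptions chosen for illustration. Each toy client summarizes its low-level features per label, the summaries are pooled into one shared estimate, and a client then draws synthetic features from that estimate to train its high-level part on classes it never observed locally.

```python
import random
import statistics


def low_level(x):
    """Hypothetical low-level model: a fixed scalar feature extractor."""
    return 2.0 * x


def local_feature_stats(data):
    """Per-label (count, mean, variance) of low-level features on one client."""
    by_label = {}
    for x, y in data:
        by_label.setdefault(y, []).append(low_level(x))
    return {y: (len(h), statistics.fmean(h), statistics.pvariance(h))
            for y, h in by_label.items()}


def merge_stats(client_stats):
    """Pool per-client statistics into one shared (mean, std) estimate per label."""
    pooled = {}
    for stats in client_stats:
        for y, (n, mean, var) in stats.items():
            cnt, s1, s2 = pooled.get(y, (0, 0.0, 0.0))
            # Accumulate the count, the sum, and the sum of squares.
            pooled[y] = (cnt + n, s1 + n * mean, s2 + n * (var + mean ** 2))
    return {y: (s1 / cnt, max(s2 / cnt - (s1 / cnt) ** 2, 0.0) ** 0.5)
            for y, (cnt, s1, s2) in pooled.items()}


def sample_shared_features(shared, per_label, rng):
    """Draw synthetic (feature, label) pairs from the shared Gaussian estimate."""
    return [(rng.gauss(mean, std), y)
            for y, (mean, std) in shared.items()
            for _ in range(per_label)]


# Label-skewed toy clients (assumed data): each one sees a single class only.
client_a = [(0.0, 0), (1.0, 0)]
client_b = [(4.0, 1), (5.0, 1)]

shared = merge_stats([local_feature_stats(client_a),
                      local_feature_stats(client_b)])
rng = random.Random(0)

# Client A's high-level model would train on its own features plus shared samples,
# so it sees features for both labels even though its raw data covers only label 0.
train_a = [(low_level(x), y) for x, y in client_a]
train_a += sample_shared_features(shared, 3, rng)
```

In this toy setup only per-label summary statistics leave a client, never raw inputs, which is the spirit of training the high-level portion on a reconstructed feature distribution.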

What problem does this paper attempt to address?

The paper addresses the problems that non-independent and identically distributed (Non-IID) data cause in Federated Learning (FL), in particular the phenomenon of client drift. It targets two core problems:

1. **Client Drift Problem**: In federated learning, differences in data distribution (heterogeneity) across clients degrade the performance of the trained model, especially its generalization ability. Through theoretical analysis, the paper identifies client drift as the main cause of this decline.
2. **Improving Client Updates**: Existing research mostly reduces the inconsistency between client models by manipulating gradients, but these methods still leave a performance gap relative to centralized training. The paper therefore proposes a new perspective: alleviating client drift by generating improved local models.

To address these problems, the paper proposes a method called FedImpro. Its main contributions are:

- **Theoretical Contribution**: The paper analyzes the contribution of local training to generalization performance and proves that this contribution is bounded by the conditional Wasserstein distance between client data distributions. In other words, identical marginal distributions across clients are not sufficient to ensure good generalization performance.
- **Technical Contribution**: To overcome privacy limitations, FedImpro decomposes a deep neural network into two parts: a lower-level model and a higher-level model. The lower-level model is responsible for feature extraction, while the higher-level model is trained on the reconstructed feature distribution. This promotes local training under similar conditional distributions, which improves the generalization contribution and reduces gradient dissimilarity.
- **Experimental Validation**: The paper validates FedImpro through a series of experiments, with a focus on highly heterogeneous data. The results show that FedImpro can significantly improve the generalization performance of the model in various federated learning settings.

In summary, through theoretical analysis and technical innovation, the paper provides an effective approach to the client drift problem in federated learning, particularly in scenarios with high data heterogeneity.
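
The point that matching marginal distributions is not enough can be made concrete with a small worked example. The sketch below is a simplification under stated assumptions, not the paper's exact definition: features are one-dimensional, both clients hold equally many samples per label, and the per-label Wasserstein-1 distances are weighted by the label frequencies of the first client. The function names `wasserstein_1d` and `conditional_wasserstein` and the toy data are hypothetical.

```python
from collections import defaultdict


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    For equal-size samples with uniform weights, the optimal coupling in one
    dimension matches sorted values, so the distance is the mean absolute gap.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def conditional_wasserstein(client_a, client_b):
    """Per-label W1 distance, averaged with client_a's label frequencies.

    Each client is a list of (feature, label) pairs.
    """
    by_label_a, by_label_b = defaultdict(list), defaultdict(list)
    for x, y in client_a:
        by_label_a[y].append(x)
    for x, y in client_b:
        by_label_b[y].append(x)
    total = len(client_a)
    return sum(len(xs) / total * wasserstein_1d(xs, by_label_b[y])
               for y, xs in by_label_a.items())


# Two toy clients with the same features overall but swapped labels.
client_a = [(0.0, 0), (1.0, 0), (4.0, 1), (5.0, 1)]
client_b = [(0.0, 1), (1.0, 1), (4.0, 0), (5.0, 0)]

marginal = wasserstein_1d([x for x, _ in client_a], [x for x, _ in client_b])
conditional = conditional_wasserstein(client_a, client_b)

print(marginal)     # 0.0: the marginal feature distributions coincide
print(conditional)  # 4.0: the label-conditional distributions are far apart
```

Both clients hold the feature values 0, 1, 4, and 5, so the marginal distance is zero, yet within each label the features sit four units apart. A distance computed on marginals alone would miss this kind of heterogeneity entirely.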

FedImpro: Measuring and Improving Client Update in Federated Learning

Understanding the Training Dynamics in Federated Deep Learning via Aggregation Weight Optimization

FedDGP: Disentangling Global and Personal Models for Federated Learning

FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction

Harnessing Client Drift with Decoupled Gradient Dissimilarity

Disentangling Client Contributions: Improving Federated Learning Accuracy in the Presence of Heterogeneous Data

Rethinking Client Drift in Federated Learning: A Logit Perspective

Improving Generalization and Personalization in Model-Heterogeneous Federated Learning

FedBoost: Bayesian Estimation based Client Selection for Federated Learning

Heterogeneity-Aware Federated Learning with Adaptive Client Selection and Gradient Compression

FedDistill: Global Model Distillation for Local Model De-Biasing in Non-IID Federated Learning

Fine-tuning Global Model Via Data-Free Knowledge Distillation for Non-IID Federated Learning

Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data

Stabilizing and Accelerating Federated Learning on Heterogeneous Data With Partial Client Participation

Towards Instance-adaptive Inference for Federated Learning

FedMoS: Taming Client Drift in Federated Learning with Double Momentum and Adaptive Selection

FedAgg: Adaptive Federated Learning with Aggregated Gradients

FedAR: Addressing Client Unavailability in Federated Learning with Local Update Approximation and Rectification

MimiC: Combating Client Dropouts in Federated Learning by Mimicking Central Updates

Federated Learning with Unbiased Gradient Aggregation and Controllable Meta Updating

An Aggregation-Free Federated Learning for Tackling Data Heterogeneity