FedImpro: Measuring and Improving Client Update in Federated Learning

Zhenheng Tang, Yonggang Zhang, Shaohuai Shi, Xinmei Tian, Tongliang Liu, Bo Han, Xiaowen Chu
2024-03-14
Abstract: Federated Learning (FL) models often experience client drift caused by heterogeneous data, where the distribution of data differs across clients. To address this issue, advanced research primarily focuses on manipulating the existing gradients to achieve more consistent client models. In this paper, we present an alternative perspective on client drift and aim to mitigate it by generating improved local models. First, we analyze the generalization contribution of local training and conclude that this generalization contribution is bounded by the conditional Wasserstein distance between the data distribution of different clients. Then, we propose FedImpro, to construct similar conditional distributions for local training. Specifically, FedImpro decouples the model into high-level and low-level components, and trains the high-level portion on reconstructed feature distributions. This approach enhances the generalization contribution and reduces the dissimilarity of gradients in FL. Experimental results show that FedImpro can help FL defend against data heterogeneity and enhance the generalization performance of the model.
Artificial Intelligence; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
The paper addresses the problems that non-independent and identically distributed (Non-IID) data cause in Federated Learning (FL), in particular client drift. It targets two core problems:

1. **Client Drift Problem**: In federated learning, differences in data distribution (heterogeneity) across clients degrade the trained model, especially its generalization ability. The paper's theoretical analysis attributes this decline mainly to client drift.
2. **Improving Client Updates**: Existing research mostly reduces the inconsistency between client models by manipulating gradients, yet these methods still leave a performance gap relative to centralized training. The paper therefore takes a different perspective: mitigating client drift by generating improved local models.

To address these problems, the paper proposes a method called FedImpro. Its main contributions are:

- **Theoretical Contribution**: The paper analyzes how much local training contributes to generalization performance and shows that this contribution is bounded by the conditional Wasserstein distance between clients' data distributions. Consequently, identical marginal distributions across clients are not sufficient for good generalization; the conditional distributions must also be similar.
- **Technical Contribution**: To work within privacy constraints, FedImpro splits a deep neural network into two parts: a lower-level model and a higher-level model. The lower-level model extracts features, while the higher-level model is trained on the reconstructed feature distribution. Local training therefore proceeds under similar conditional distributions on every client, which improves the generalization contribution and reduces gradient dissimilarity.
- **Experimental Validation**: The paper evaluates FedImpro in a series of experiments, with an emphasis on highly heterogeneous data. The results show that FedImpro improves the generalization performance of the model across a range of federated learning settings.

In summary, the paper combines a theoretical analysis with a model-decoupling method to mitigate client drift in federated learning, particularly in scenarios with high data heterogeneity.
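
The idea of training the higher-level model on a reconstructed feature distribution can be illustrated with a small sketch. This is not the authors' implementation. It assumes scalar features, a toy one-weight lower-level model, and a per-class Gaussian approximation of the feature distribution; every function name here (`low_level`, `class_feature_stats`, `pool_stats`, `sample_shared_features`, `gaussian_w2`) is hypothetical. Each client reports only per-class feature moments, the server pools them, and any client can then sample synthetic features for classes it rarely sees locally.

```python
import math
import random
import statistics
from collections import defaultdict


def low_level(x, w=0.5):
    # Hypothetical lower-level model: one scalar weight followed by tanh.
    return math.tanh(w * x)


def class_feature_stats(samples, w=0.5):
    # Client side: per-class (mean, variance, count) of low-level features.
    by_class = defaultdict(list)
    for x, y in samples:
        by_class[y].append(low_level(x, w))
    return {y: (statistics.fmean(f), statistics.pvariance(f), len(f))
            for y, f in by_class.items()}


def pool_stats(per_client):
    # Server side: merge per-class moments, weighting each client by its count.
    acc = defaultdict(lambda: [0, 0.0, 0.0])  # count, sum of x, sum of x^2
    for stats in per_client:
        for y, (mean, var, n) in stats.items():
            acc[y][0] += n
            acc[y][1] += n * mean
            acc[y][2] += n * (var + mean * mean)
    pooled = {}
    for y, (n, s1, s2) in acc.items():
        mean = s1 / n
        pooled[y] = (mean, max(s2 / n - mean * mean, 0.0))
    return pooled


def sample_shared_features(pooled, n_per_class, rng):
    # Draw synthetic (feature, label) pairs from the shared per-class Gaussians
    # (the Gaussian form is an assumption of this sketch).
    return [(rng.gauss(mean, math.sqrt(var)), y)
            for y, (mean, var) in sorted(pooled.items())
            for _ in range(n_per_class)]


def gaussian_w2(mean_a, var_a, mean_b, var_b):
    # Closed-form 2-Wasserstein distance between two 1-D Gaussians.
    return math.hypot(mean_a - mean_b, math.sqrt(var_a) - math.sqrt(var_b))


# Two clients with opposite label skew (synthetic data for illustration only).
rng = random.Random(0)
client_a = ([(rng.gauss(-1.0, 0.5), 0) for _ in range(90)]
            + [(rng.gauss(1.0, 0.5), 1) for _ in range(10)])
client_b = ([(rng.gauss(-1.0, 0.5), 0) for _ in range(10)]
            + [(rng.gauss(1.0, 0.5), 1) for _ in range(90)])

stats_a = class_feature_stats(client_a)
stats_b = class_feature_stats(client_b)
pooled = pool_stats([stats_a, stats_b])

# Each client would train its higher-level model on local features plus these.
shared = sample_shared_features(pooled, n_per_class=50, rng=rng)

# Per-class distance between the two clients' feature distributions.
per_class_gap = {y: gaussian_w2(stats_a[y][0], stats_a[y][1],
                                stats_b[y][0], stats_b[y][1])
                 for y in pooled}
```

In this sketch, `shared` gives both clients a class-balanced set of features drawn from the same distribution, which is one simple way to make the training distribution of the higher-level model more similar across clients. The `per_class_gap` values are only a rough, Gaussian-based proxy for a class-conditional distance and should not be read as the quantity in the paper's bound.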