Accelerating Hybrid Federated Learning Convergence under Partial Participation

Jieming Bian, Lei Wang, Kun Yang, Cong Shen, Jie Xu
2024-05-19
Abstract: Over the past few years, Federated Learning (FL) has become a popular distributed machine learning paradigm. FL involves a group of clients with decentralized data who collaborate to learn a common model under the coordination of a centralized server, with the goal of protecting clients' privacy by ensuring that local datasets never leave the clients and that the server only performs model aggregation. However, in realistic scenarios, the server may be able to collect a small amount of data that approximately mimics the population distribution and has stronger computational ability to perform the learning process. To address this, we focus on the hybrid FL framework in this paper. While previous hybrid FL work has shown that the alternative training of clients and server can increase convergence speed, it has focused on the scenario where clients fully participate and ignores the negative effect of partial participation. In this paper, we provide theoretical analysis of hybrid FL under clients' partial participation to validate that partial participation is the key constraint on convergence speed. We then propose a new algorithm called FedCLG, which investigates the two-fold role of the server in hybrid FL. Firstly, the server needs to process the training steps using its small amount of local datasets. Secondly, the server's calculated gradient needs to guide the participated clients' training and the server's aggregation. We validate our theoretical findings through numerical experiments, which show that our proposed method FedCLG outperforms state-of-the-art methods.
Distributed, Parallel, and Cluster Computing, Machine Learning
What problem does this paper attempt to address?
The paper addresses how to accelerate the convergence of Hybrid Federated Learning (HFL) under partial client participation. Specifically, it focuses on improving training efficiency and model accuracy by using the server's small dataset when client data is non-IID and only a subset of clients participates in each round.

### Background and Motivation

Traditional Federated Learning (FL) assumes that all data is stored on the clients and that the server is only responsible for model aggregation, which leaves the server's greater computing power unused. In real-world scenarios, the server may also hold a small amount of data that approximates the overall data distribution, and this data can assist the training process. However, most existing Hybrid Federated Learning methods assume that clients participate fully and that data is independent and identically distributed (IID), which does not always hold in practice. Partial client participation and non-IID data slow model convergence, so new methods are needed.

### Main Contributions of the Paper

1. **Theoretical Analysis**:
   - Provides a theoretical convergence analysis of Hybrid Federated Learning methods (such as CLG-SGD) under non-IID data and partial participation.
   - The analysis shows that even when the server performs more local training, the partial participation error remains a key factor limiting convergence speed.
2. **New Algorithm FedCLG**:
   - Proposes a new algorithm, FedCLG, which uses the server's small dataset to correct partial participation errors and thereby accelerate model convergence.
   - FedCLG has two versions, FedCLG-S and FedCLG-C, which perform variance correction in the server aggregation step and in the client local training step, respectively.
3. **Experimental Verification**:
   - Verifies through extensive experiments on two datasets that FedCLG performs better under non-IID data and partial participation.

### Specific Methods

- **FedCLG-C**:
  - In each round, the server selects a random subset of clients and uses its small dataset to compute the gradient \( g_s^t \).
  - Each selected client uses \( g_s^t \) for variance correction during local training to reduce partial participation errors.
  - After completing local training, each client sends its update to the server, and the server performs model aggregation followed by its own local training.
- **FedCLG-S**:
  - In each round, the server selects a random subset of clients and uses its small dataset to compute the gradient \( g_s^t \).
  - After completing local training, each client sends its update and its gradient \( g_i^t \) to the server.
  - The server uses \( g_s^t \) and \( g_i^t \) for variance correction in the aggregation step to reduce partial participation errors.
  - The server then performs model aggregation followed by its own local training.

### Theoretical Analysis

- **Convergence Analysis**:
  - Proves the convergence of FedCLG under non-IID data and partial participation.
  - The analysis shows that FedCLG can significantly reduce the impact of partial participation errors, thereby accelerating model convergence.

### Experimental Results

- **Performance Comparison**:
  - The experimental results show that FedCLG outperforms existing state-of-the-art methods on multiple metrics, especially under non-IID data and partial participation.
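
The two variants described under Specific Methods can be illustrated with a small NumPy sketch. It is not the paper's implementation: the summary does not give the exact update rules, so the sketch assumes a SCAFFOLD-style correction term (local gradient minus the client's round-start gradient plus the server gradient) for FedCLG-C, and a first-order correction scaled by `local_steps * lr` for FedCLG-S. The quadratic client losses, the function names `make_problem`, `grad`, and `run_fedclg`, and all hyperparameters are hypothetical choices made only for illustration.

```python
import numpy as np

def make_problem(num_clients=20, dim=5, seed=0):
    """Toy non-IID setup: client i minimizes 0.5 * ||x - c_i||^2."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(size=(num_clients, dim))  # one optimum per client
    # Assumption: the server's small dataset approximates the population,
    # modeled here as the mean client optimum plus small noise.
    server_center = centers.mean(axis=0) + 0.01 * rng.normal(size=dim)
    return centers, server_center, rng

def grad(x, center):
    """Exact gradient of 0.5 * ||x - center||^2."""
    return x - center

def run_fedclg(variant, centers, server_center, rng, rounds=50,
               clients_per_round=4, local_steps=5, lr=0.1, server_steps=2):
    """Run an illustrative FedCLG-C ("C") or FedCLG-S ("S") loop."""
    if variant not in ("C", "S"):
        raise ValueError("variant must be 'C' or 'S'")
    x = np.full(centers.shape[1], 10.0)  # deliberately poor starting point
    for _round in range(rounds):
        # Partial participation: only a random subset of clients trains.
        sampled = rng.choice(len(centers), size=clients_per_round, replace=False)
        g_s = grad(x, server_center)  # server gradient g_s^t
        deltas, client_grads = [], []
        for i in sampled:
            g_i = grad(x, centers[i])  # client gradient g_i^t at round start
            y = x.copy()
            for _step in range(local_steps):
                direction = grad(y, centers[i])
                if variant == "C":
                    # Assumed client-side correction: steer the local step
                    # toward the server's estimate of the global gradient.
                    direction = direction - g_i + g_s
                y -= lr * direction
            deltas.append(y - x)
            client_grads.append(g_i)
        update = np.mean(deltas, axis=0)
        if variant == "S":
            # Assumed server-side correction: swap the sampled clients'
            # average gradient for the server gradient (first-order).
            update -= local_steps * lr * (g_s - np.mean(client_grads, axis=0))
        x = x + update  # model aggregation
        for _step in range(server_steps):
            x -= lr * grad(x, server_center)  # server's own local training
    return x

centers, server_center, rng = make_problem()
optimum = centers.mean(axis=0)  # minimizer of the average client loss
for variant in ("C", "S"):
    x_final = run_fedclg(variant, centers, server_center, rng)
    print(variant, np.linalg.norm(x_final - optimum))
```

In this toy problem the client-side correction cancels each client's own optimum out of the local step, so every sampled client moves toward the server's estimate regardless of which clients were drawn. The server-side correction removes only part of the sampling noise, because the `local_steps * lr` scaling is a first-order approximation.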

### Conclusion

Through theoretical analysis and experimental verification, the paper shows that FedCLG is effective in Hybrid Federated Learning, especially for the convergence problem that arises under non-IID data and partial participation.