A Resource-Adaptive Approach for Federated Learning under Resource-Constrained Environments

Ruirui Zhang, Xingze Wu, Yifei Zou, Zhenzhen Xie, Peng Li, Xiuzhen Cheng, Dongxiao Yu
2024-06-19
Abstract: The paper studies a fundamental federated learning (FL) problem involving multiple clients with heterogeneous constrained resources. Compared with the numerous training parameters, the computing and communication resources of clients are insufficient for fast local training and real-time knowledge sharing. Besides, training on clients with heterogeneous resources may result in the straggler problem. To address these issues, we propose Fed-RAA: a Resource-Adaptive Asynchronous Federated learning algorithm. Different from vanilla FL methods, where all parameters are trained by each participating client regardless of resource diversity, Fed-RAA adaptively allocates fragments of the global model to clients based on their computing and communication capabilities. Each client then individually trains its assigned model fragment and asynchronously uploads the updated result. Theoretical analysis confirms the convergence of our approach. Additionally, we design an online greedy-based algorithm for fragment allocation in Fed-RAA, achieving fairness comparable to an offline strategy. We present numerical results on MNIST, CIFAR-10, and CIFAR-100, along with necessary comparisons and ablation studies, demonstrating the advantages of our work. To the best of our knowledge, this paper represents the first resource-adaptive asynchronous method for fragment-based FL with guaranteed theoretical convergence.
Machine Learning; Artificial Intelligence; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
The paper addresses client resource constraints in Federated Learning (FL) by proposing Fed-RAA, a resource-adaptive asynchronous federated learning algorithm. Specifically, it targets the following problems:

1. **Client Resource Constraints**: Client devices participating in federated training have varying, and often limited, computational and communication capabilities. Relative to the number of parameters to be trained, these resources are insufficient for fast local training and real-time knowledge sharing.
2. **Resource Allocation and Fairness in Asynchronous Federated Learning**: Traditional federated learning methods typically aggregate synchronously, so the server waits for all clients to complete local training before updating the global model. Each round is therefore bounded by the slowest client (the straggler, usually the one with the fewest resources), which reduces overall training efficiency. Moving to an asynchronous, fragment-based scheme raises a further question: how to allocate model fragments fairly to clients with different capabilities.

To address these problems, the paper proposes the Fed-RAA algorithm, whose main contributions are:

- **Resource-Adaptive Model Fragment Allocation**: Fragments of the global model are dynamically allocated to each client for training based on its computational and communication capabilities.
- **Asynchronous Model Aggregation**: A client uploads its model fragment as soon as it completes local training, and the server updates the global model accordingly without waiting for the other clients to finish.
- **Fairness and Theoretical Convergence Guarantee**: An online greedy algorithm is designed for fragment allocation, achieving fairness comparable to an offline strategy, and the convergence of the algorithm is proven theoretically.

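
As a rough illustration of the two mechanisms described above (capability-aware fragment allocation and asynchronous aggregation), the following Python sketch may help. It is not the paper's algorithm. The linear cost model, the per-round `deadline`, the "least-trained feasible fragment first" greedy rule, and the mixing weight `alpha` are all assumptions made for illustration, as are the names `Client`, `Server`, `round_time`, `assign`, and `apply_update`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Client:
    """Hypothetical client profile (field names are illustrative assumptions)."""
    name: str
    compute_rate: float  # parameters trained per second
    bandwidth: float     # parameters uploaded per second


def round_time(client: Client, fragment_size: int) -> float:
    """Assumed linear cost model: time to train a fragment plus time to upload it."""
    return fragment_size / client.compute_rate + fragment_size / client.bandwidth


@dataclass
class Server:
    """Holds the global model as named fragments, each a flat list of parameters."""
    fragments: Dict[str, List[float]]
    train_counts: Dict[str, int] = field(default_factory=dict)

    def assign(self, client: Client, deadline: float) -> Optional[str]:
        """Online greedy stand-in: among the fragments this client can finish
        within the deadline, pick the one trained least often so far
        (ties go to the larger fragment). Returns None if nothing fits."""
        feasible = [
            name for name, params in self.fragments.items()
            if round_time(client, len(params)) <= deadline
        ]
        if not feasible:
            return None
        choice = min(
            feasible,
            key=lambda n: (self.train_counts.get(n, 0), -len(self.fragments[n])),
        )
        self.train_counts[choice] = self.train_counts.get(choice, 0) + 1
        return choice

    def apply_update(self, name: str, new_params: List[float], alpha: float = 0.5) -> None:
        """Asynchronous aggregation: blend one uploaded fragment into the global
        model as soon as it arrives, without waiting for any other client."""
        old = self.fragments[name]
        self.fragments[name] = [
            (1 - alpha) * o + alpha * n for o, n in zip(old, new_params)
        ]
```

With a 4-parameter fragment and a 100-parameter fragment, a slow client under a tight deadline would only ever be offered the small fragment, while a fast client would be steered toward whichever fragment has received the fewest updates. Only the fragment named in `apply_update` changes, which is what lets the server proceed without a synchronization barrier.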
Experimental results show that, compared to existing federated learning algorithms, Fed-RAA converges faster and reaches higher accuracy on the MNIST, CIFAR-10, and CIFAR-100 datasets, especially in resource-constrained environments. Ablation studies further validate the effectiveness of each component of Fed-RAA and show how parameter settings affect the algorithm's performance.