Efficient Scheduling for Multi-Job Federated Learning Systems with Client Sharing

Boqian Fu, Fahao Chen, Peng Li, Zhou Su
DOI: https://doi.org/10.1109/dasc/picom/cbdcom/cy59711.2023.10361429
2023-01-01
Abstract: Federated Learning (FL) has emerged as a promising learning approach for data distributed across edge devices. Existing research mainly focuses on single-job FL systems. However, in practical scenarios, multiple FL jobs are often submitted simultaneously. Simply applying single-job optimizations to multi-job FL systems results in sub-optimal system performance. Specifically, we find that resource utilization on the client side is considerably low due to device heterogeneity. In this paper, we exploit opportunities in multi-job FL systems to improve resource utilization through client sharing: (1) clients not selected for one FL job can be allocated to another FL job, and (2) clients that complete their tasks early in one FL job can be preemptively assigned to another job. We propose an efficient scheduling algorithm for multi-job FL systems, named GMFL. The algorithm assigns an available job to a client as soon as that client becomes available. To ensure training convergence, we carefully select jobs for each client while considering several constraints. We conduct experiments using four popular models across four different datasets to evaluate the proposed scheduling algorithm. Experimental results show that it significantly outperforms existing methods, with a performance improvement of up to 2.03×.
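
The client-sharing idea can be illustrated with a minimal event-driven sketch in Python. This is not the GMFL algorithm: the abstract does not specify the job-selection rule or the convergence constraints, so the rule used here (pick the job with the most outstanding tasks) is a placeholder, and the names `greedy_schedule`, `client_speeds`, and `job_demands` are hypothetical. The sketch only shows how a client that finishes early in one job can be handed to another job the moment it becomes idle.

```python
import heapq


def greedy_schedule(client_speeds, job_demands):
    """Assign each client to a pending job as soon as the client is idle.

    client_speeds: dict client_id -> time units per task (assumed model)
    job_demands:   dict job_id -> number of client tasks still needed (assumed model)
    Returns a list of (start_time, client_id, job_id) assignments.
    """
    remaining = dict(job_demands)
    # Min-heap of (time the client becomes idle, client_id); all idle at t=0.
    idle_events = [(0.0, cid) for cid in sorted(client_speeds)]
    heapq.heapify(idle_events)
    schedule = []

    while idle_events and any(n > 0 for n in remaining.values()):
        now, cid = heapq.heappop(idle_events)
        # Placeholder rule (an assumption, not from the paper):
        # choose the job with the most outstanding tasks.
        job = max(sorted(remaining), key=lambda j: remaining[j])
        remaining[job] -= 1
        schedule.append((now, cid, job))
        # The client becomes idle again after finishing this task.
        heapq.heappush(idle_events, (now + client_speeds[cid], cid))

    return schedule


if __name__ == "__main__":
    # One fast and one slow client sharing two jobs.
    for entry in greedy_schedule({"fast": 1.0, "slow": 3.0}, {"A": 2, "B": 2}):
        print(entry)
```

In this toy run the fast client finishes its first task early and is reassigned twice while the slow client is still busy, which is the kind of idle time that client sharing is meant to reclaim.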