CoopFL: Accelerating federated learning with DNN partitioning and offloading in heterogeneous edge computing
Zhiyuan Wang, Hongli Xu, Yang Xu, Zhida Jiang, Jianchun Liu
DOI: https://doi.org/10.1016/j.comnet.2022.109490
IF: 5.493
2022-11-26
Computer Networks
Abstract: Federated learning (FL), a novel distributed machine learning (DML) approach, has been widely adopted to train deep neural networks (DNNs) over massive data in edge computing. However, existing FL systems often suffer from long training times due to resource limitations and system heterogeneity (e.g., in computing, communication, and memory) in edge computing. To this end, we design and implement an FL system, called CoopFL, which trains DNNs through cooperation between devices and edge servers. Specifically, we implement DNN partitioning and offloading techniques in CoopFL, which enable each device to train a partial DNN model and offload the intermediate data output by some hidden layers to suitable edge servers for cooperative training. However, the empirical partitioning and offloading strategies used in previous works may not exploit system resources well and may even slow down the training procedure. We therefore formulate a problem that accounts for resource constraints and system heterogeneity, and propose an efficient algorithm to solve it, accelerating training through the resulting DNN partitioning and offloading strategy. Extensive experiments on classical models and datasets demonstrate the effectiveness of our system. For example, CoopFL achieves a speedup of 2.3-4.9× compared with the baselines, which include a hierarchical federated learning system (HFL), a typical federated learning system (TFL), and two systems with empirical DNN partitioning, i.e., FedMEC and HFLP.
computer science, information systems,telecommunications,engineering, electrical & electronic, hardware & architecture
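Illustrative sketch (not from the paper): the abstract describes each device training the first part of a DNN and offloading the intermediate output of a hidden layer to an edge server. The toy cost model below shows why the choice of partition point matters. It is only an assumption-laden illustration, not CoopFL's algorithm: the names `Layer` and `best_split`, the units, and the latency formula (device compute + transfer of the cut activation + server compute, forward pass only, one device, one server) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Layer:
    flops: float         # compute cost of the layer (hypothetical units)
    output_bytes: float  # size of the activation this layer emits


def best_split(layers: List[Layer], device_speed: float,
               server_speed: float, bandwidth: float) -> Tuple[int, float]:
    """Return (k, latency): the device runs layers[:k], the server runs layers[k:].

    k starts at 1 so that raw input data never leaves the device.
    Backward-pass traffic and memory limits are ignored in this toy model.
    """
    best_k, best_latency = 0, float("inf")
    for k in range(1, len(layers) + 1):
        device_time = sum(l.flops for l in layers[:k]) / device_speed
        server_time = sum(l.flops for l in layers[k:]) / server_speed
        # Nothing is transferred when the device runs the whole model.
        sent = 0.0 if k == len(layers) else layers[k - 1].output_bytes
        latency = device_time + sent / bandwidth + server_time
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


# Example with made-up numbers: a slow device, a 4x faster server.
model = [Layer(flops=4, output_bytes=100),
         Layer(flops=8, output_bytes=10),
         Layer(flops=8, output_bytes=50)]
k, latency = best_split(model, device_speed=1.0, server_speed=4.0, bandwidth=10.0)
print(k, latency)
```

With these made-up numbers, cutting after the second layer wins because its activation is small. Cutting earlier pays a large transfer cost, and keeping everything on the device pays a large compute cost. A real system would also have to account for multiple devices and servers, memory limits, and gradient traffic, which this sketch leaves out.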