Resource-efficient Parallel Split Learning in Heterogeneous Edge Computing

Mingjin Zhang, Jiannong Cao, Yuvraj Sahni, Xiangchun Chen, Shan Jiang

2024-03-23

Abstract: Edge AI has recently been proposed to facilitate the training and deployment of Deep Neural Network (DNN) models in proximity to the sources of data. To enable the training of large models on resource-constrained edge devices and protect data privacy, parallel split learning is becoming a practical and popular approach. However, current parallel split learning neglects the resource heterogeneity of edge devices, which may lead to the straggler issue. In this paper, we propose EdgeSplit, a novel parallel split learning framework to better accelerate distributed model training on heterogeneous and resource-constrained edge devices. EdgeSplit enhances the efficiency of model training on less powerful edge devices by adaptively segmenting the model into varying depths. Our approach focuses on reducing total training time by formulating and solving a task scheduling problem, which determines the most efficient model partition points and bandwidth allocation for each device. We employ a straightforward yet effective alternating algorithm for this purpose. Comprehensive tests conducted with a range of DNN models and datasets demonstrate that EdgeSplit not only facilitates the training of large models on resource-restricted edge devices but also surpasses existing baselines in performance.

Distributed, Parallel, and Cluster Computing

What problem does this paper attempt to address?

The paper aims to address the issue of efficient distributed model training on resource-constrained and heterogeneous edge devices. Specifically, the paper proposes the EdgeSplit framework, a novel parallel split learning framework that improves the efficiency of distributed model training through the following methods:

1. **Adaptive Model Splitting**: Dynamically splits the complete model into parts of different depths based on the heterogeneous computational capabilities of edge devices, thereby optimizing the model training tasks on each device.
2. **Task Scheduling and Bandwidth Allocation**: Determines the most effective model split points and bandwidth allocation strategies between devices and the server through mathematical modeling and solving the task scheduling problem, minimizing the overall training time.
3. **Improving Training Speed**: Experimental results show that EdgeSplit significantly improves training speed compared to other baseline methods, achieving up to 5.5 times acceleration on the ResNet50 model without loss of accuracy.

Through these technical means, EdgeSplit enables efficient training of large-scale models on resource-constrained edge devices and significantly reduces the total training time by offloading part of the computational tasks to more powerful Federated Learning (FL) servers.
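The alternation between choosing split points and allocating bandwidth can be illustrated with a small sketch. This is not the paper's formulation: the latency model (device compute, plus upload of the cut-layer activations, plus server compute at a dedicated per-device speed), all function names, and the example numbers are assumptions made purely for illustration. The sketch repeats two steps until the cuts stop changing: with bandwidth fixed, each device picks its fastest cut; with cuts fixed, bandwidth is divided so that all devices finish at the same time.

```python
from typing import List, Tuple


def device_time(k: int, flops: List[float], act_bits: List[float],
                dev_speed: float, srv_speed: float, bw: float) -> float:
    """Per-round time when the device runs layers 0..k-1 and the server the rest."""
    return (sum(flops[:k]) / dev_speed      # device-side computation
            + act_bits[k - 1] / bw          # upload of cut-layer activations
            + sum(flops[k:]) / srv_speed)   # server-side computation (assumed dedicated)


def best_cuts(flops, act_bits, dev_speeds, srv_speed, bws) -> List[int]:
    """Step 1: with bandwidth fixed, pick each device's fastest cut point."""
    candidates = range(1, len(flops))  # keep at least one layer on each side
    return [min(candidates,
                key=lambda k: device_time(k, flops, act_bits, f, srv_speed, b))
            for f, b in zip(dev_speeds, bws)]


def allocate_bandwidth(cuts, flops, act_bits, dev_speeds, srv_speed,
                       total_bw) -> List[float]:
    """Step 2: with cuts fixed, split total_bw so all devices finish together."""
    base = [sum(flops[:k]) / f + sum(flops[k:]) / srv_speed
            for k, f in zip(cuts, dev_speeds)]       # compute-only time
    size = [act_bits[k - 1] for k in cuts]           # data each device uploads
    lo = max(base)                    # no deadline can beat the compute time
    hi = lo + sum(size) / total_bw    # this deadline is always feasible
    for _ in range(100):              # bisection on the common deadline
        mid = (lo + hi) / 2
        needed = sum(s / (mid - b) for s, b in zip(size, base))
        if needed > total_bw:
            lo = mid
        else:
            hi = mid
    return [s / (hi - b) for s, b in zip(size, base)]


def alternate(flops, act_bits, dev_speeds, srv_speed, total_bw,
              max_iters: int = 20) -> Tuple[List[int], List[float], float]:
    """Alternate steps 1 and 2 until the cut points stop changing."""
    n = len(dev_speeds)
    bws = [total_bw / n] * n          # start from an equal split
    cuts = None
    for _ in range(max_iters):
        new_cuts = best_cuts(flops, act_bits, dev_speeds, srv_speed, bws)
        if new_cuts == cuts:
            break
        cuts = new_cuts
        bws = allocate_bandwidth(cuts, flops, act_bits, dev_speeds,
                                 srv_speed, total_bw)
    round_time = max(device_time(k, flops, act_bits, f, srv_speed, b)
                     for k, f, b in zip(cuts, dev_speeds, bws))
    return cuts, bws, round_time


# Hypothetical 4-layer model and two devices, one four times slower than the other.
flops = [4.0, 4.0, 4.0, 4.0]        # per-layer compute cost (arbitrary units)
act_bits = [8.0, 4.0, 2.0, 1.0]     # output size of each layer (arbitrary units)
cuts, bws, round_time = alternate(flops, act_bits, dev_speeds=[1.0, 4.0],
                                  srv_speed=20.0, total_bw=10.0)
print(cuts, bws, round_time)
```

In this toy setting the slower device ends up with the shallower cut and the larger share of bandwidth, which mirrors the idea of splitting the model at varying depths for heterogeneous devices. Neither step can increase the slowest device's time, so the round time never gets worse across iterations, although the result is not guaranteed to be globally optimal.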

Extendable Multi-Device Collaborative Pipeline Parallel Inference in the Edge-Cloud Scenario

Efficient Parallel Split Learning over Resource-constrained Wireless Edge Networks

Split Federated Learning Over Heterogeneous Edge Devices: Algorithm and Optimization

AccEPT: an Acceleration Scheme for Speeding Up Edge Pipeline-parallel Training

EdgeSP: Scalable Multi-device Parallel DNN Inference on Heterogeneous Edge Clusters

Asteroid: Resource-Efficient Hybrid Pipeline Parallelism for Collaborative DNN Training on Heterogeneous Edge Devices

An Efficient Split Learning Framework for Recurrent Neural Network in Mobile Edge Environment

Decentralized Proactive Model Offloading and Resource Allocation for Split and Federated Learning

Split Learning Over Wireless Networks: Parallel Design and Resource Management

SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural Networks in Mobile Edge Environments

Adaptive Partitioning and Efficient Scheduling for Distributed DNN Training in Heterogeneous IoT Environment

DynaSplit: A Hardware-Software Co-Design Framework for Energy-Aware Inference on Edge

Towards Efficient Edge Learning for Large Models in Heterogeneous Resource-limited Environments

Auto-Split: A General Framework of Collaborative Edge-Cloud AI

Edge–IoT Computing and Networking Resource Allocation for Decomposable Deep Learning Inference

ParallelSFL: A Novel Split Federated Learning Framework Tackling Heterogeneity Issues

Towards Resource-aware DNN Partitioning for Edge Devices with Heterogeneous Resources

MTL-Split: Multi-Task Learning for Edge Devices using Split Computing

Federated Split Learning for Edge Intelligence in Resource-Constrained Wireless Networks

CoEdge: Cooperative DNN Inference With Adaptive Workload Partitioning Over Heterogeneous Edge Devices