An Adaptive Task Migration Scheduling Approach for Edge-Cloud Collaborative Inference
Boyin Zhang, Yinggang Li, Shigeng Zhang, Yue Zhang, Bing Zhu
DOI: https://doi.org/10.1155/2022/8804530
2022-01-01
Wireless Communications and Mobile Computing
Abstract: Deep Neural Network (DNN) models have achieved excellent performance in many inference tasks and are widely used in intelligent applications. However, DNN models often require substantial computational resources to complete inference tasks, which hinders their deployment on resource-constrained edge devices. To extend the application scenarios of DNN models, edge-cloud collaborative inference methods, represented by model partition, have attracted much research attention in recent years. In scenarios with multiple edge devices deployed, edge-cloud collaborative inference requires partial migration of tasks, but traditional scheduling methods migrate only whole tasks. In this paper, we propose two task scheduling methods that solve the problem of partial task migration in multiedge scenarios. The first scheduling method is based on the optimal cutting of a single DNN: the cutting position is the same for all models, regardless of the influence of external factors. This method is suitable for chain and directed acyclic graph- (DAG-) type DNNs. The second scheduling method takes external factors such as congestion and queuing delay at the cloud side into consideration. It dynamically selects the cutting position of each DNN to optimize the overall delay and is applicable to chain DNN models. The experimental results show that, compared with the baseline method, our proposed scheduling method can reduce the delay by up to 6.48x.
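To make the idea of selecting a cutting position concrete, the following is a minimal sketch for a chain DNN. It is not the authors' algorithm. The function name `best_cut`, the per-layer latency lists, the bandwidth parameter, and the single `cloud_queue_ms` term are all hypothetical, and the cost model (edge compute + transfer of the intermediate tensor + cloud queuing + cloud compute, ignoring the return of the result) is an assumption made for illustration.

```python
from typing import Sequence, Tuple


def best_cut(
    edge_ms: Sequence[float],       # assumed per-layer latency on the edge device
    cloud_ms: Sequence[float],      # assumed per-layer latency on the cloud server
    out_bytes: Sequence[float],     # assumed output size of each layer
    input_bytes: float,             # size of the raw model input
    bandwidth_bytes_per_ms: float,  # assumed uplink bandwidth
    cloud_queue_ms: float = 0.0,    # assumed waiting time at the cloud side
) -> Tuple[int, float]:
    """Return (cut, latency) for a chain DNN.

    cut = k means layers [0, k) run on the edge and layers [k, n) on the cloud.
    cut = 0 offloads everything; cut = n keeps everything on the edge.
    """
    n = len(edge_ms)
    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        edge_time = sum(edge_ms[:k])
        if k == n:
            # Fully local inference: no transfer, no cloud queuing.
            total = edge_time
        else:
            # The tensor crossing the cut is the input (k = 0) or layer k-1's output.
            size = input_bytes if k == 0 else out_bytes[k - 1]
            transfer = size / bandwidth_bytes_per_ms
            total = edge_time + transfer + cloud_queue_ms + sum(cloud_ms[k:])
        if total < best_latency:
            best_k, best_latency = k, total
    return best_k, best_latency


# Made-up numbers for a three-layer chain model.
edge = [10, 20, 40]
cloud = [1, 2, 4]
sizes = [4000, 1000, 100]

static_cut = best_cut(edge, cloud, sizes, input_bytes=8000, bandwidth_bytes_per_ms=100)
loaded_cut = best_cut(edge, cloud, sizes, input_bytes=8000, bandwidth_bytes_per_ms=100,
                      cloud_queue_ms=50)
print(static_cut, loaded_cut)
```

With `cloud_queue_ms` left at zero, the sketch corresponds loosely to a fixed cut chosen without regard to external factors. Passing a nonzero queuing delay shows how cloud-side load can push the preferred cut toward the edge, which is the intuition behind selecting the cut per DNN.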