Ace-Sniper: Cloud-Edge Collaborative Scheduling Framework With DNN Inference Latency Modeling on Heterogeneous Devices
Weihong Liu, Jiawei Geng, Zongwei Zhu, Yang Zhao, Cheng Ji, Changlong Li, Zirui Lian, Xuehai Zhou
DOI: https://doi.org/10.1109/tcad.2023.3314388
IF: 2.9
2023-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: Cloud-edge collaborative inference requires efficient scheduling of artificial intelligence (AI) tasks to the appropriate edge intelligence devices. Deep neural network (DNN) inference latency has become a vital basis for improving scheduling efficiency. However, edge devices are highly heterogeneous due to differences in hardware architecture, computing power, etc. Meanwhile, DNNs are themselves diverse and continue to iterate over time. This diversity of devices and DNNs makes measurement-based methods computationally expensive, while invasive prediction methods require significant development effort and have limited applicability. In this paper, we propose and develop Ace-Sniper, a scheduling framework with DNN inference latency modeling on heterogeneous devices. First, to address device heterogeneity, a unified hardware resource modeling (HRM) method is designed that treats the platforms as black-box functions that output feature vectors. Second, neural network similarity (NNS) is introduced for feature extraction from diverse and frequently iterated DNNs. Finally, with the results of HRM and NNS as input, a performance characterization network is designed to predict the latencies of given unseen DNNs on heterogeneous devices; these predictions can be combined with most time-based scheduling algorithms. Experimental results show that the average relative error of DNN inference latency prediction is 11.11%, and the prediction accuracy reaches 93.2%. Compared with non-time-aware scheduling methods, the average waiting time for tasks is reduced by 82.95%, and the platform throughput is improved by 63% on average.
engineering, electrical & electronic; computer science, interdisciplinary applications; hardware & architecture
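The abstract states that predicted latencies can feed most time-based scheduling algorithms and that prediction quality is reported as an average relative error. The sketch below illustrates both ideas under stated assumptions. It is not the paper's scheduler: the greedy earliest-finish-time rule is a stand-in for a generic time-based policy, and the function names, task names, device names, and latency values are all hypothetical.

```python
def mean_relative_error(predicted, measured):
    """Average of |predicted - measured| / measured over paired samples."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - m) / m for p, m in zip(predicted, measured)) / len(measured)


def schedule_earliest_finish(tasks, devices, predicted_latency):
    """Greedy time-based scheduling (illustrative stand-in, not Ace-Sniper itself).

    tasks: task names in arrival order (all assumed to arrive at time 0).
    devices: device names.
    predicted_latency: dict mapping (task, device) to predicted latency,
        i.e. the kind of output a latency predictor would supply.
    Returns a list of (task, device, start, finish) tuples.
    """
    free_at = {d: 0.0 for d in devices}  # time at which each device becomes idle
    plan = []
    for task in tasks:
        # Pick the device on which this task is predicted to finish earliest.
        best = min(devices, key=lambda d: free_at[d] + predicted_latency[(task, d)])
        start = free_at[best]
        finish = start + predicted_latency[(task, best)]
        free_at[best] = finish
        plan.append((task, best, start, finish))
    return plan


def average_waiting_time(plan):
    """Mean start time, since every task is assumed to arrive at time 0."""
    return sum(start for _, _, start, _ in plan) / len(plan)
```

A scheduler that ignores predicted latency (for example, round-robin) has no way to see that a slow device will keep a task queued longer. That is the intuition behind comparing time-aware and non-time-aware methods by average waiting time.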