On-demand Edge Inference Scheduling with Accuracy and Deadline Guarantee.

Yechao She, Minming Li, Yang Jin, Meng Xu, Jianping Wang, Bin Liu
DOI: https://doi.org/10.1109/iwqos57198.2023.10188769
2023-01-01
Abstract: To meet the increasing demand for machine-learning-based applications, pushing inference services to the network edge has become a trend. This work aims to design an on-demand edge inference scheduler with accuracy and deadline guarantees for repetitive tasks. Specifically, we consider an edge server that is preinstalled with multiple early-exit Deep Neural Networks (DNNs), where each DNN-exit pair provides an inference service of a different quality. We also consider the diversity of tasks in their quality-of-service requirements and associated utility. We aim to maximize the system's total utility by optimizing service assignment and time scheduling subject to resource, accuracy, and deadline constraints. We present an integer linear programming formulation of this problem and show that it is NP-hard even in the offline case. The problem is challenging because service assignment and time scheduling are coupled. To derive low-complexity scheduling solutions, we introduce a task-service graph and convert the problem into a service assignment selection problem with schedulability constraints. We then design a polynomial-complexity algorithm with a $\frac{\rho}{\delta}$-approximation ratio for the offline problem, where $\rho$ is the task-wise utility ratio and $\delta$ is the maximum number of concurrent tasks. To handle the online problem, we propose an online heuristic algorithm. Simulation results show that the proposed algorithms outperform state-of-the-art baseline algorithms.
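To make the coupling between service assignment and time scheduling concrete, the following is a minimal sketch of a simplified offline instance. It is not the paper's algorithm: it is an exponential-time brute force, and everything in it is an assumption made for illustration. The names `Service`, `Task`, `edf_schedulable`, and `best_assignment` are hypothetical, all tasks are assumed to be released at time 0 on a single server that runs one task at a time, and the utility of a served task is assumed to be its weight times the accuracy (in percent) of the assigned service.

```python
from itertools import product
from typing import List, NamedTuple, Optional, Sequence, Tuple


class Service(NamedTuple):
    """One DNN-exit pair (hypothetical model): accuracy in percent, latency in ms."""
    accuracy: int
    latency: int


class Task(NamedTuple):
    """One inference request (hypothetical model), released at time 0."""
    min_accuracy: int  # required accuracy in percent
    deadline: int      # absolute deadline in ms
    weight: int        # assumed utility per accuracy point if served


def edf_schedulable(jobs: Sequence[Tuple[int, int]]) -> bool:
    """Check (deadline, latency) jobs on one server, all released at time 0.

    Running jobs in earliest-deadline-first order is enough here: if that
    order misses a deadline, every other order does too.
    """
    finish = 0
    for deadline, latency in sorted(jobs):
        finish += latency
        if finish > deadline:
            return False
    return True


def best_assignment(
    tasks: Sequence[Task], services: Sequence[Service]
) -> Tuple[int, Tuple[Optional[int], ...]]:
    """Enumerate every assignment (service index or None = reject) and keep the best."""
    options: List[List[Optional[int]]] = []
    for task in tasks:
        # Accuracy constraint: only services that meet the task's requirement.
        allowed = [i for i, s in enumerate(services) if s.accuracy >= task.min_accuracy]
        options.append([None] + allowed)

    best_utility = 0
    best_choice: Tuple[Optional[int], ...] = tuple(None for _ in tasks)
    for choice in product(*options):
        served = [(k, c) for k, c in enumerate(choice) if c is not None]
        jobs = [(tasks[k].deadline, services[c].latency) for k, c in served]
        # Deadline constraint: the chosen services must fit on the server.
        if not edf_schedulable(jobs):
            continue
        utility = sum(tasks[k].weight * services[c].accuracy for k, c in served)
        if utility > best_utility:
            best_utility, best_choice = utility, choice
    return best_utility, best_choice


services = [Service(accuracy=70, latency=2), Service(accuracy=90, latency=5)]
tasks = [Task(60, 5, 1), Task(80, 8, 1), Task(60, 6, 2)]
print(best_assignment(tasks, services))  # (230, (None, 1, 0))
```

In this toy instance the third task would earn more utility from the more accurate service, but that choice would push the second task past its deadline, so the best solution serves it with the faster exit and rejects the first task. The search space grows exponentially with the number of tasks, which is consistent with the abstract's motivation for a polynomial-complexity approximation.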