Scheduling Generative-AI Job DAGs with Model Serving in Data Centers

Ying Zheng, Lei Jiao, Yuedong Xu, Bo An, Xin Wang, Zongpeng Li
DOI: https://doi.org/10.1109/iwqos61813.2024.10682885
2024-01-01
Abstract: Scheduling generative-AI jobs in the edge computing environment faces multiple non-trivial challenges, including the Directed Acyclic Graph (DAG) dependency among tasks, the intrinsic intertwinement between task scheduling and model selection, and the dynamic, unpredictable arrival of job DAGs. In this work, we capture all such challenges and formulate a non-linear integer program to optimize the long-term profit of the generative-AI service provider, i.e., the service revenue of the admitted jobs minus the system costs of executing the tasks contained in such job DAGs. This problem is NP-hard even in the offline setting. To solve it, we first reformulate it into an equivalent schedule selection problem, using generated schedules to handle the complex constraints. Then, we design a new online scheduling method based on the online primal-dual technique. Experimental results confirm that our approach can increase the total service profit by up to 41.2% compared to existing algorithms.
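To make the idea of online primal-dual schedule selection concrete, the following is a minimal, generic sketch and not the algorithm from the paper. It assumes each arriving job comes with a small set of pre-generated candidate schedules, each with a revenue, a cost, and a per-resource usage. The names `Schedule`, `OnlineScheduler`, `max_price`, and the exponential pricing rule are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Schedule:
    revenue: float  # service revenue if the job is admitted with this schedule
    cost: float     # system cost of executing this schedule
    usage: dict     # resource name -> amount consumed (hypothetical units)


@dataclass
class OnlineScheduler:
    capacity: dict            # resource name -> total capacity
    max_price: float = 10.0   # assumed upper bound on the per-unit dual price
    load: dict = field(default_factory=dict)
    profit: float = 0.0

    def price(self, resource):
        # Dual price grows exponentially from 0 (idle) to max_price (full).
        util = self.load.get(resource, 0.0) / self.capacity[resource]
        return (self.max_price + 1.0) ** util - 1.0

    def fits(self, schedule):
        # A schedule is feasible only if no resource capacity is exceeded.
        return all(
            self.load.get(r, 0.0) + u <= self.capacity[r] + 1e-9
            for r, u in schedule.usage.items()
        )

    def offer(self, schedules):
        """Pick the candidate with the largest positive surplus, or reject."""
        best, best_surplus = None, 0.0
        for s in schedules:
            if not self.fits(s):
                continue
            charged = sum(self.price(r) * u for r, u in s.usage.items())
            surplus = s.revenue - s.cost - charged
            if surplus > best_surplus:
                best, best_surplus = s, surplus
        if best is None:
            return None  # job rejected: no schedule is worth its dual price
        for r, u in best.usage.items():
            self.load[r] = self.load.get(r, 0.0) + u
        self.profit += best.revenue - best.cost
        return best
```

Calling `offer` once per arriving job, in arrival order, admits a job only when its net profit exceeds the current price of the resources it would occupy. As resources fill up, prices rise and low-value jobs are turned away, which is the basic intuition behind primal-dual admission rules.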