Niagara: Scheduling DNN Inference Services on Heterogeneous Edge Processors.

Daliang Xu, Qing Li, Mengwei Xu, Kang Huang, Gang Huang, Shangguang Wang, Xin Jin, Yun Ma, Xuanzhe Liu
DOI: https://doi.org/10.1007/978-3-031-48421-6_6
2023-01-01
Abstract: Intelligent applications heavily rely on deep neural network (DNN) inference services executed on edge devices to fulfill functional prerequisites while safeguarding user data privacy. However, the execution of such DNN services on resource-constrained edge devices poses a significant challenge: low throughput of inference tasks. To this end, this paper proposes Niagara, a novel system designed to maximize system throughput by judiciously scheduling DNN inference services on heterogeneous processors available on edge devices. Niagara faces two critical challenges: uncertain workload dynamics and high scheduling complexity. To effectively address these challenges, Niagara employs a predictive model to anticipate incoming workload patterns and orchestrates the allocation of services across heterogeneous processors through a combination of offline scheduling optimization and online service dispatching strategies. We have implemented Niagara and conducted thorough experiments. The results demonstrate that Niagara surpasses state-of-the-art approaches by elevating DNN inference throughput by up to 4.67x, all while satisfying the same stringent inference latency requirements. Furthermore, Niagara has been successfully deployed in real-world power supply substations to detect violations, ensuring uninterrupted, accident-free operation during its six-month deployment period.