DPS: Dynamic Pricing and Scheduling for Distributed Machine Learning Jobs in Edge-Cloud Networks

Ruiting Zhou, Ne Wang, Yifeng Huang, Jinlong Pang, Hao Chen
DOI: https://doi.org/10.1109/tmc.2022.3195765
IF: 6.075
2022-01-01
IEEE Transactions on Mobile Computing
Abstract: 5G and the Internet of Things stimulate smart applications of edge computing, such as autonomous driving and smart cities. As edge computing power increases, more and more machine learning (ML) jobs will be trained in the edge-cloud network, adopting the parameter server (PS) architecture. Due to the distinct features of the edge (low latency and scarce resources), the cloud (high delay and rich computing capacity) and ML jobs (frequent communication between workers and PSs, and unfixed runtime), existing cloud job pricing and scheduling algorithms are not applicable. Therefore, how to price, deploy and schedule ML jobs in the edge-cloud network becomes a challenging problem. To solve it, we propose DPS, an auction-based online framework. DPS consists of three major parts: job admission control, price function design and a scheduling orchestrator. DPS dynamically prices workers and PSs based on historical job information and real-time system status, and decides whether to accept a job according to its deployment cost. DPS then deploys and schedules accepted ML jobs to maximize social welfare. Through theoretical analysis, we prove that DPS achieves a good competitive ratio and truthfulness in polynomial time. Large-scale simulations and testbed experiments show that DPS can improve social welfare by at least 95% compared with benchmark algorithms in today's cloud systems.
computer science, information systems, telecommunications
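
The admission step described in the abstract (price workers and PSs from the current system status, then accept a job only if its bid covers the deployment cost) can be illustrated with a minimal sketch. This is not the paper's algorithm: the exponential price curve, the names `Job`, `unit_price` and `admit`, the single-site resource pool, and the default price bounds are all assumptions made for illustration, and the abstract does not specify the actual price function or how historical job information enters it.

```python
from dataclasses import dataclass


@dataclass
class Job:
    bid: float      # value the user reports for completing the job
    workers: int    # number of workers requested
    servers: int    # number of parameter servers (PSs) requested


def unit_price(used: int, capacity: int, p_min: float, p_max: float) -> float:
    """Hypothetical price curve (an assumption, not the paper's function):
    rises exponentially from p_min at zero load to p_max at full load."""
    return p_min * (p_max / p_min) ** (used / capacity)


def admit(job: Job, used: dict, capacity: dict,
          p_min: float = 1.0, p_max: float = 100.0):
    """Return (accepted, payment) and update `used` in place on acceptance.

    `used` and `capacity` are dicts keyed by "worker" and "ps".
    """
    need = {"worker": job.workers, "ps": job.servers}
    # Reject if any resource type would exceed its capacity.
    if any(used[r] + need[r] > capacity[r] for r in need):
        return False, 0.0
    # Deployment cost at the prices posted before this job arrives.
    cost = sum(need[r] * unit_price(used[r], capacity[r], p_min, p_max)
               for r in need)
    # Accept only if the bid covers the deployment cost.
    if job.bid < cost:
        return False, 0.0
    for r in need:
        used[r] += need[r]
    # The payment is the posted cost and does not depend on the bid.
    return True, cost


used = {"worker": 0, "ps": 0}
capacity = {"worker": 10, "ps": 10}
first = admit(Job(bid=5.0, workers=2, servers=1), used, capacity)
second = admit(Job(bid=5.0, workers=2, servers=1), used, capacity)
print(first, second)
```

In this sketch the second, identical job is rejected because the first one raised utilization and therefore the posted prices. Charging the posted cost rather than the bid is one common way to make misreporting unprofitable in posted-price auctions; whether DPS obtains truthfulness this way is not stated in the abstract.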