Taming Serverless Cold Start of Cloud Model Inference With Edge Computing
Zhi Zhou, Shen Cai, Kongyange Zhao, Fei Xu, Xu Chen, Lei Jiao
DOI: https://doi.org/10.1109/TMC.2023.3348165
IF: 6.075
2024-08-01
IEEE Transactions on Mobile Computing
Abstract: Serverless computing is envisioned as the de facto standard for next-generation cloud computing. However, the cold start dilemma has impeded its adoption by delay-sensitive and bursty applications. In this paper, we propose to tame serverless cold start in a cloud inference system with edge computing. Specifically, the proposed solution smooths the serverless cloud workload with user-owned edge computing, reducing the number of cold starts. Leveraging the configurability of requests and serverless functions, the proposed solution further reduces transmission latency and serverless cost by adapting the request configuration (e.g., image resolution) and the function configuration (e.g., memory). Because configuration adaptation can degrade inference accuracy, we aim to strike a balance among inference latency, cost, and accuracy. Achieving this goal is non-trivial, since the underlying optimization is non-convex and depends on uncertain future information. To address both challenges simultaneously, the proposed cold-start-aware online algorithms apply a regularization technique to decompose the problem into separate convex subproblems. They then apply lazy switching to smooth the number of provisioned functions and thus reduce cold starts. Through rigorous theoretical analysis, realistic prototype evaluations on AWS Lambda, and trace-driven simulations, we comprehensively validate the theoretical and empirical performance of the proposed solution.
Engineering, Computer Science