Auto Scaling Virtual Machines for Web Applications with Queueing Theory
Gaopan Huang, Songyun Wang, Mingming Zhang, Yefei Li, Zhuzhong Qian, Yuan Chen, Sheng Zhang
DOI: https://doi.org/10.1109/icsai.2016.7810994
2016-01-01
Abstract: With the rapid development of cloud computing in recent years, more and more individuals and corporations deploy their web applications on cloud computing platforms, which can significantly reduce their deployment costs. However, the number of requests to a web application often fluctuates over time, resulting in the so-called peak-valley phenomenon: resources are usually reserved in proportion to the peak demand, while the amount actually required is far below the peak load most of the time, so physical servers sit idle for much of the time. To solve this problem, we establish an M/M/C queueing model, that is, a queue with an infinite customer source and multiple service windows (servers). Based on this queueing model, we can accurately predict the arrival time of each customer, which enables us to calculate the minimum amount of resources that meets the resource needs. We then use heuristic algorithms and a dynamic programming method to design Virtual Machine (VM) auto-scaling strategies, covering both horizontal scaling and vertical scaling. With the proposed model and scaling algorithms, web applications can meet customer needs while using the least amount of resources, which improves resource utilization and minimizes deployment costs. Extensive experiments show that the proposed model and scaling algorithms can greatly improve resource utilization without sacrificing web application performance.
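To make the idea of "minimum amount of resources" under an M/M/C model concrete, the following is a minimal sketch using the standard Erlang C formula. It is not the paper's algorithm: the service-level target (a bound on mean queueing delay), the function names `erlang_c`, `mean_wait` and `min_servers`, and the assumption that each VM acts as one server with service rate `mu` are all illustrative assumptions not stated in the abstract.

```python
import math


def erlang_c(c: int, offered_load: float) -> float:
    """Probability that an arriving request has to wait in an M/M/c queue.

    offered_load is a = lambda / mu. Uses the numerically stable
    Erlang B recursion and converts the result to Erlang C.
    """
    if offered_load >= c:
        return 1.0  # unstable queue: every arrival eventually waits
    b = 1.0  # Erlang B with zero servers
    for k in range(1, c + 1):
        b = offered_load * b / (k + offered_load * b)
    rho = offered_load / c  # per-server utilization
    return b / (1.0 - rho * (1.0 - b))


def mean_wait(c: int, lam: float, mu: float) -> float:
    """Mean time a request spends waiting in the queue (excluding service)."""
    a = lam / mu
    if a >= c:
        return math.inf
    return erlang_c(c, a) / (c * mu - lam)


def min_servers(lam: float, mu: float, max_wait: float) -> int:
    """Smallest c whose mean queueing delay is at most max_wait.

    Hypothetical sizing rule (an assumption, not taken from the paper).
    """
    if lam <= 0 or mu <= 0 or max_wait <= 0:
        raise ValueError("lam, mu and max_wait must be positive")
    c = math.floor(lam / mu) + 1  # smallest c with lam < c * mu
    while mean_wait(c, lam, mu) > max_wait:
        c += 1
    return c
```

For example, `min_servers(lam=100.0, mu=10.0, max_wait=0.01)` returns the smallest number of servers that keeps the mean queueing delay at or below 10 ms for 100 requests per second when each server handles 10 requests per second. A horizontal scaling policy could re-evaluate this value whenever the estimated arrival rate changes; a different target, such as a bound on the probability of waiting, would only change the loop condition.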