A QoS-Aware Load Balancing Policy in Multi-tenancy Environment

Hailong Sun, Tao Zhao, Yu Tang, Xudong Liu
DOI: https://doi.org/10.1109/SOSE.2014.21
2014-01-01
Abstract: Cloud computing aims to provide services on top of a shared pool of underlying resources, so load balancing is of paramount importance in such an environment. At the same time, multi-tenancy is widely adopted in cloud computing to reduce the cost of service provisioning and to improve resource utilization. Multi-tenancy brings new challenges to load balancing, since hosted applications compete for resources and have different QoS requirements. Servers that host multiple applications therefore need a request scheduling policy that guarantees each application's quality of service, e.g., response time. However, most QoS-aware load balancing algorithms do not account for the mutual interference among applications deployed on the same server. Under heavy load, the mean response time of some applications may become unacceptably high. In this work, we propose a new load balancing algorithm, "Server Throughput Restriction (STR)", based on the M/G/s/s+r queueing model, to guarantee each application's mean response time while also achieving better server throughput. In addition, we conduct several experiments to analyze the performance of STR in comparison with Round-Robin and Least-Work-Remaining.
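For readers unfamiliar with the two baselines named in the abstract, the following is a minimal sketch of Round-Robin and Least-Work-Remaining dispatching. It is not the paper's STR algorithm or its experimental setup. The class names, the `pick` interface, the `simulate` helper, and the simplification that queued work never drains are all assumptions made for illustration.

```python
from itertools import cycle


class RoundRobin:
    """Dispatch requests to servers in a fixed rotating order."""

    def __init__(self, servers):
        self._order = cycle(servers)

    def pick(self, work_remaining):
        # Ignores load entirely; the argument exists only for a uniform interface.
        return next(self._order)


class LeastWorkRemaining:
    """Dispatch each request to the server with the least outstanding work."""

    def pick(self, work_remaining):
        # work_remaining maps server name -> estimated units of queued work.
        # Ties go to the server that appears first in the mapping.
        return min(work_remaining, key=work_remaining.get)


def simulate(policy, servers, request_costs):
    """Assign each request cost to a server and return the work per server.

    Assumption: work only accumulates (no service over time), so this shows
    only how each policy spreads a batch of requests.
    """
    work = {s: 0.0 for s in servers}
    for cost in request_costs:
        target = policy.pick(work)
        work[target] += cost
    return work


if __name__ == "__main__":
    servers = ["a", "b"]
    costs = [5, 1, 1, 1]  # one expensive request followed by three cheap ones
    print(simulate(RoundRobin(servers), servers, costs))
    print(simulate(LeastWorkRemaining(), servers, costs))
```

With one expensive request followed by cheap ones, Round-Robin keeps alternating regardless of load, while Least-Work-Remaining steers the cheap requests away from the busy server. Neither policy distinguishes between applications sharing a server, which is the gap the abstract says STR is meant to address.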