Monitoring-Based Task Scheduling in Large-Scale SaaS Cloud

Puheng Zhang, Chuang Lin, Xiao Ma, Fengyuan Ren, Wenzhuo Li
DOI: https://doi.org/10.1007/978-3-319-46295-0_9
2016-01-01
Abstract: With the increasing scale of SaaS and the continuous growth in server failures, task scheduling problems become more intricate, and both scheduling quality and scheduling speed raise further concerns. In this paper, we first propose a virtualized, monitoring-based SaaS model with predictive maintenance to minimize the costs of fault tolerance. Then, using the monitored and predicted availability states of servers, we focus on dynamic real-time task scheduling in large-scale heterogeneous SaaS, aiming to jointly optimize long-term performance benefits and energy costs in order to improve scheduling quality. We formulate a dynamic programming problem in which both the state and action spaces are too large for the problem to be solved by simple iteration. To address these issues, we draw on machine learning theory and put forward an approximate dynamic programming algorithm. We use value function approximation and a candidate-heuristic method to address the state-space and action-space explosions, respectively. As a result, computational complexity is significantly reduced and scheduling speed is greatly enhanced. Finally, we conduct experiments with both random simulation data and Google cloud trace logs. QoS evaluations and comparisons demonstrate that our approach is effective and efficient under bursty requests and high throughput.
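The two ideas named in the abstract, value function approximation for the state space and a candidate heuristic for the action space, can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the feature set, cost function, or heuristic, so the aggregate load features, the "k least-loaded available servers" rule, the cost of queue length plus a per-server power term, and all function names below are assumptions made for illustration only.

```python
def features(loads):
    """Compress the per-server load vector into a few aggregate features
    (hypothetical choice: bias, mean, max, variance)."""
    n = len(loads)
    mean = sum(loads) / n
    var = sum((x - mean) ** 2 for x in loads) / n
    return [1.0, mean, float(max(loads)), var]

def value(weights, loads):
    """Linear approximation of the long-term cost of a state."""
    return sum(w * f for w, f in zip(weights, features(loads)))

def candidates(loads, available, k):
    """Assumed candidate heuristic: keep only the k least-loaded servers
    that monitoring reports (or predicts) as available."""
    idx = [i for i, ok in enumerate(available) if ok]
    return sorted(idx, key=lambda i: loads[i])[:k]

def schedule(loads, available, power, weights, k=3, gamma=0.9):
    """Choose the candidate minimising immediate cost (queue length after
    placement plus an assumed power cost) plus the discounted approximate
    future cost."""
    best, best_cost = None, float("inf")
    for i in candidates(loads, available, k):
        nxt = list(loads)
        nxt[i] += 1
        cost = nxt[i] + power[i] + gamma * value(weights, nxt)
        if cost < best_cost:
            best, best_cost = i, cost
    return best, best_cost

def td_update(weights, loads, target, lr=0.01):
    """One gradient step moving the approximate value of `loads`
    toward an observed target cost."""
    err = target - value(weights, loads)
    return [w + lr * err * f for w, f in zip(weights, features(loads))]

# Toy usage: four servers, server 1 is flagged unavailable.
loads = [2, 0, 1, 5]
available = [True, False, True, True]
power = [1.0, 0.5, 3.0, 1.0]
weights = [0.0, 0.0, 0.0, 0.0]
server, cost = schedule(loads, available, power, weights, k=2)
weights = td_update(weights, loads, cost)
```

In this sketch each decision scans only k candidates rather than every server, and the value estimate depends on four features rather than the full load vector, which is the kind of reduction in action and state space the abstract describes.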