Cluster Scheduler on Heterogeneous Cloud

Xiao Ling, Jiahai Yang, Dan Wang, Ye Wang
DOI: https://doi.org/10.1109/hpcc-css-icess.2015.114
2015-01-01
Abstract: With the increasingly widespread adoption of cloud computing and tenants' growing need for large-scale data processing, cluster scheduling frameworks (e.g., MapReduce and Spark) have emerged as important programming models for distributed and parallel computing on cloud systems. Although several recent studies have proposed optimizations for MapReduce-like schedulers, they rarely consider the significant impact of external factors caused by the heterogeneity of cloud systems, especially I/O contention and instance type selection. In this paper, we present a simplified abstraction of the cluster scheduling problem and formulate it as an optimization problem. The objective, minimizing the total weighted completion time of tasks, is NP-complete, so we propose MRS, a novel 7-approximation heuristic algorithm. By comparing our algorithm with other classical scheduling strategies on Amazon EC2, we demonstrate that MRS consistently outperforms these algorithms under different scenarios.
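To make the objective concrete, the following minimal sketch computes the total weighted completion time of a task sequence and compares two orderings. It is not the MRS algorithm, which the abstract does not describe. It assumes a single machine with no I/O contention or instance heterogeneity, and the names `Task`, `weighted_completion_time`, and `wspt_order` are hypothetical. The ordering shown is the classical weighted-shortest-processing-time rule (Smith's rule), which is optimal only in this simplified single-machine setting.

```python
from typing import List, Tuple

# Hypothetical task representation: (processing_time, weight), weight > 0.
Task = Tuple[float, float]


def weighted_completion_time(order: List[Task]) -> float:
    """Sum of weight * completion time when tasks run back to back on one machine."""
    clock = 0.0
    total = 0.0
    for processing_time, weight in order:
        clock += processing_time      # this task finishes at the current clock
        total += weight * clock
    return total


def wspt_order(tasks: List[Task]) -> List[Task]:
    """Smith's rule: run tasks in increasing order of processing_time / weight."""
    return sorted(tasks, key=lambda t: t[0] / t[1])


# Illustrative data only, not taken from the paper.
tasks = [(4.0, 1.0), (1.0, 3.0), (2.0, 2.0)]
fifo_cost = weighted_completion_time(tasks)              # arrival order
wspt_cost = weighted_completion_time(wspt_order(tasks))  # Smith's rule order
print(fifo_cost, wspt_cost)
```

On this toy input the arrival order costs 33 and the Smith's rule order costs 16, which shows how strongly ordering alone affects the objective even before heterogeneity is taken into account.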