InSTechAH: An Autoscaling Scheme for Hadoop in the Private Cloud

Xueying Wang, ZhiHui Lu, Jie Wu, Tong Zhao, Patrick Hung
DOI: https://doi.org/10.1109/SCC.2015.61
2015-01-01
Abstract: Research shows that in many cloud data centers, physical resources are not used efficiently and therefore incur extra overhead. A practical way to improve the cost-effectiveness of these resources is to run big data applications on the residual capacity. The main challenge is the performance loss caused by resource competition and interference among different types of applications. In this paper, we design, implement, and evaluate InSTechAH, an autoscaling scheme for a Hadoop system in a private cloud. It aims to improve resource utilization in cloud data centers while maintaining the required quality of service by autoscaling and scheduling background analytics tasks. In this system, we design a multilayer node model that reduces interference from other services by automatically scaling the clusters according to the autoscaling algorithm we introduce. We then build a resource scheduling model that uses a prediction-based scheduling method to reduce the cost of scaling. We evaluate our scheme partly on a real data trace and partly in simulation, with Hadoop as the parallel data analytics framework and OpenStack as the cloud management architecture, to show the efficiency of the InSTechAH system.