Using Adaptive Resource Allocation to Implement an Elastic MapReduce Framework

Jiaqi Zhao, Changlong Xue, Xinlin Tao, Shugong Zhang, Jie Tao
DOI: https://doi.org/10.1002/spe.2398
2016-01-01
Software Practice and Experience
Abstract: Today, we are observing a transition of science paradigms from computational science to data-intensive science. With the exponential increase of input and intermediate data, more applications are being developed using the MapReduce programming model, which is regarded as an appropriate programming model for analysing large data sets. A MapReduce framework runs its applications on a cluster, where the computing capacity allocated to the applications is limited and may not meet their runtime resource demand. In this case, the Map/Reduce tasks have to wait in queues, and the applications suffer from poor performance. This work develops an autonomic resource manager within the Hadoop MapReduce framework. The manager is capable of detecting when the resources allocated to its user community are overloaded or under-loaded. In the former case, it requests more resources from, for example, the batch system of a High Performance Computing (HPC) cluster or a computing cloud and, if they are acquired, integrates the additional resources into the Hadoop MapReduce runtime. In the latter case, the manager gives the free resources back to their source. We extended the existing Hadoop MapReduce resource manager to implement the proposed strategy and validated the concept on an HPC cluster with standard benchmark applications. Experimental results show a significant performance gain, for example, an improvement of up to 45% in execution time when running multiple applications. Copyright © 2016 John Wiley & Sons, Ltd.
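
The overload and under-load handling described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `ClusterState` and `plan_adjustment`, the slot-based load measure, and the parameters `slots_per_node` and `max_extra_nodes` are all hypothetical and chosen only for illustration. The abstract does not state how load is measured or which thresholds are used.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical snapshot of the cluster, not a Hadoop data structure."""
    pending_tasks: int  # Map/Reduce tasks waiting in the queue
    free_slots: int     # task slots currently unused
    idle_nodes: int     # dynamically acquired nodes with no running task


def plan_adjustment(state, slots_per_node=4, max_extra_nodes=8):
    """Return nodes to request (positive), release (negative), or 0.

    Assumed policy: request nodes when queued tasks exceed free slots,
    release idle nodes when nothing is queued.
    """
    backlog = state.pending_tasks - state.free_slots
    if backlog > 0:
        # Overloaded: ask the resource provider for enough nodes to
        # cover the backlog, capped at an assumed per-request limit.
        needed = math.ceil(backlog / slots_per_node)
        return min(needed, max_extra_nodes)
    if state.pending_tasks == 0 and state.idle_nodes > 0:
        # Under-loaded: hand idle nodes back to their source.
        return -state.idle_nodes
    return 0


# Example: 10 queued tasks, 2 free slots, 4 slots per node.
print(plan_adjustment(ClusterState(pending_tasks=10, free_slots=2, idle_nodes=0)))
```

In a real deployment the returned value would be translated into a request to the batch system or cloud provider, and a positive request may be only partly granted or not granted at all, which is why the abstract speaks of integrating resources only if they are acquired.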