MapReduce Job Optimization: A Mapping Study

Qinghua Lu, Liming Zhu, He Zhang, Dongyao Wu, Zheng Li, Xiwei Xu
DOI: https://doi.org/10.1109/ccbd.2015.33
2015-01-01
Abstract: MapReduce has become the standard model for supporting big data analytics. In particular, MapReduce job optimization is widely considered crucial to implementing big data analytics. However, guidelines are still lacking, especially for practitioners, on how MapReduce jobs can be optimized. This paper aims to systematically identify and taxonomically classify the existing work on job optimization. We conducted a mapping study of 47 selected papers published between 2004 and 2014, and classified and compared them using a 5WH-based characterization framework. This study generates a knowledge base of current job optimization solutions and identifies a set of research gaps and opportunities. It concludes that job optimization is still at an early stage of maturity. More attention needs to be paid to cross-data-center, cross-cluster, and cross-rack job optimization to improve communication efficiency.