Energy Prediction for MapReduce Workloads
Wenjun Li,Hailong Yang,Zhongzhi Luan,Depei Qian
DOI: https://doi.org/10.1109/DASC.2011.88
2011-01-01
Abstract: Energy efficiency of data centers has attracted wide research attention amid growing concern over power consumption and heat dissipation. MapReduce, an efficient programming model for data-intensive computing, is increasingly popular among industrial companies and academic organizations. Because MapReduce was designed specifically for large-scale data analysis, its impact on the energy efficiency of data centers has received little scrutiny. Recently, some energy-conserving strategies have been proposed to reduce the overall power consumption of MapReduce clusters. The fundamental ideas of this previous work can be summarized as scaling down the number of working nodes and reducing execution time. However, little research addresses energy prediction for MapReduce workloads, which could guide cluster administrators in setting power budgets or scheduling workloads onto clusters with different power budgets, and would also help in monitoring workloads' energy consumption. In this paper, we identify several workload metrics that correlate strongly with energy consumption. We analyze these metrics with multivariate linear regression and construct a prediction model from them. We then perform extensive regression diagnostics to refine the model. Applied to the Word Count and Sort workloads with various input sizes, the model is highly accurate: its predictions deviate from the observed energy consumption by 0.12% in the best case and 0.15% in the worst case.
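The abstract does not list the workload metrics or the fitted coefficients, so the following is only a minimal sketch of the general technique it names: fitting a multivariate linear model, energy = b0 + b1*m1 + ... + bk*mk, by ordinary least squares and reporting the relative prediction error. The function names, the three metric columns, and all numbers are hypothetical illustrations, not the authors' model or data. Pure Python is used to keep the example self-contained; in practice a numerical library would normally do the fit.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("metrics are collinear; no unique fit exists")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_energy_model(metric_rows, energies):
    """Least-squares fit via the normal equations (X^T X) b = X^T y."""
    # Prepend a constant 1.0 to each row so b[0] acts as the intercept.
    design = [[1.0] + list(row) for row in metric_rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, energies))
           for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_energy(coeffs, metric_row):
    """Evaluate the fitted linear model for one workload run."""
    return coeffs[0] + sum(b * m for b, m in zip(coeffs[1:], metric_row))


def relative_error_percent(predicted, observed):
    """Prediction inaccuracy as a percentage of the observed energy."""
    return 100.0 * abs(predicted - observed) / observed


# Hypothetical training runs: three made-up metrics per run
# (for example CPU time, data read, data written) and made-up energy values.
runs = [
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 0.5],
    [3.0, 4.0, 1.0],
    [4.0, 0.5, 2.0],
    [5.0, 3.0, 4.0],
]
energy = [14.1, 14.5, 18.4, 18.7, 22.9]

model = fit_energy_model(runs, energy)
estimate = predict_energy(model, [6.0, 1.0, 1.0])
print("coefficients:", model)
print("predicted energy for a new run:", estimate)
```

Comparing `predict_energy` against a measured value with `relative_error_percent` gives the kind of inaccuracy percentage quoted in the abstract. The regression diagnostics the authors mention (for example, checking residuals and collinearity among metrics) are not covered by this sketch.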