Hadoop in Low-Power Processors

Da Zheng, Alexander Szalay, Andreas Terzis
DOI: https://doi.org/10.48550/arXiv.1408.2284
2014-08-11
Abstract: In our previous work we introduced a so-called Amdahl blade microserver that combines a low-power Atom processor with a GPU and an SSD to provide a balanced and energy-efficient system. Our preliminary results suggested that the sequential I/O of Amdahl blades can be ten times higher than that of a cluster of conventional servers with comparable power consumption. In this paper we investigate the performance and energy efficiency of Amdahl blades running Hadoop. Our results show that Amdahl blades are 7.7 times and 3.4 times as energy-efficient as the Open Cloud Consortium cluster for a data-intensive and a compute-intensive application, respectively. The Hadoop Distributed Filesystem has relatively poor performance on Amdahl blades because both disk and network I/O are CPU-heavy operations on Atom processors. We demonstrate three effective techniques to reduce CPU consumption and improve performance. However, even with these improvements, the Atom processor is still the system's bottleneck. We revisit Amdahl's law and estimate that Amdahl blades need four Atom cores to be well balanced for Hadoop tasks.
Distributed, Parallel, and Cluster Computing