OFScheduler: A Dynamic Network Optimizer for MapReduce in Heterogeneous Cluster

Zhao Li, Yao Shen, Bin Yao, Minyi Guo
DOI: https://doi.org/10.1007/s10766-013-0281-6
2013-01-01
International Journal of Parallel Programming
Abstract: MapReduce is a popular programming paradigm in cloud computing due to its excellent scalability for processing large-scale data. However, MapReduce performs poorly in heterogeneous clusters. One reason is that Hadoop’s built-in load-balancing algorithm for the Map function generates excessive network traffic. We propose a new dynamic network optimizer for heterogeneous clusters, called OFScheduler, to relieve network traffic during the execution of MapReduce jobs. The optimizer focuses on reducing bandwidth competition, balancing the workload across network links, and increasing bandwidth utilization. It tags different types of traffic and uses OpenFlow to adjust the transfer of flows dynamically. We build a simulator and an OpenFlow testbed for evaluation. The simulation results demonstrate that the proposed optimizer significantly increases bandwidth utilization and improves MapReduce performance by 24 to 63% for most jobs in a multi-path heterogeneous cluster. The experimental results show that the proposed optimizer can be deployed in a real environment.