Cormorant: Running Analytic Queries on MapReduce with Collaborative Software-Defined Networking

Pengcheng Xiong, Xin He, Hakan Hacigumus, Prashant Shenoy
DOI: https://doi.org/10.1109/hotweb.2015.10
2015-01-01
Abstract: MapReduce is a popular choice for executing analytic workloads over large datasets on clusters of commodity machines. Due to the distributed nature of such systems, network resource bottlenecks can adversely affect performance, especially when multiple applications share the network. One of the significant barriers to reducing the occurrence and impact of such bottlenecks is the current separation between MapReduce and network management and routing. Fortunately, the emergence of software-defined networking (SDN) is removing the barriers to cooperation between Hadoop and the network. To explore the opportunity this creates, we focus on how we can use the capabilities of SDN to create a more collaborative relationship between MapReduce and the network underneath. We demonstrate the effectiveness of this collaboration through the implementation of and experiments with a system we call Cormorant. Experimental results show up to 38% improvement for analytic query performance, beyond the benefits achievable by independently optimizing MapReduce schedulers and network flow schedulers.