Efficient Online Coflow Routing And Scheduling

Yupeng Li,Shaofeng H. -C. Jiang,Haisheng Tan,Chenzi Zhang,Guihai Chen,Jipeng Zhou,Francis C. M. Lau
DOI: https://doi.org/10.1145/2942358.2942367
2016-01-01
Abstract:A coflow is a collection of related parallel flows that occur typically between two stages of a multi-stage compute task in a network, such as shuffle flows in MapReduce. The coflow abstraction allows applications to convey their semantics to the network so that application-level requirements (e.g., minimizing the completion time of the slowest flow) can be better satisfied. In this paper, we study the routing and scheduling of multiple coflows to minimize the average coflow completion time (CCT). We first propose a rounding-based randomized approximation algorithm, called OneCoflow, for single coflow routing and scheduling. The multiple coflow problem is more challenging as coexisting coflows will compete for the same network resources such as link bandwidths. To minimize the average CCT, we derive an online multiple coflow routing and scheduling algorithm, called OMCoflow, and prove that it has a reasonably good competitive ratio. To the best of our knowledge, this is the first online algorithm with theoretical performance guarantees which considers routing and scheduling simultaneously for multi-coflows. Compared with existing methods, OMCoflow runs more efficiently, and it avoids the problem of frequently rerouting the flows. Extensive simulations on a Facebook data trace show that OMCoflow outperforms the state-of-the-art heuristic schemes significantly (e.g., reducing the average CCT by up to 41.8% and the execution time by up to 99.2% against RAPIER [28]).
What problem does this paper attempt to address?