A Deadline-Aware Coflow Scheduling Approach for Big Data Applications.

Wenda Tang, Song Wang, Duanchao Li, Taigui Huang, Wanchun Dou, Shui Yu
DOI: https://doi.org/10.1109/icc.2018.8422563
2018-01-01
Abstract: Datacenters routinely process complex jobs such as MapReduce jobs. From a network perspective, most of these jobs trigger multiple parallel data flows, which together form a semantic group called a coflow. When scheduling jobs within a datacenter or across multiple datacenters, most current job schedulers do not consider the underlying network traffic load, which leads to suboptimal job completion times. We present a new deadline-aware coflow scheduling approach called DCS, which takes the underlying network traffic load into consideration while guaranteeing that a high percentage of coflows meet their deadlines. DCS aims to alleviate network congestion in datacenters whose network workloads are unbalanced, and it schedules coflows in two stages. First, it generates a task placement proposal by considering the underlying network workload. Second, it makes a scheduling decision by estimating both the task's execution time and its transmission waiting time under that task placement proposal. Simulation results based on real-world data show that DCS outperforms all existing solutions in reducing the percentage of coflows that miss their deadlines.
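The two stages described in the abstract can be illustrated with a minimal sketch. This is not the DCS algorithm from the paper, whose details the abstract does not give. It assumes a simplified model in which each coflow uses a single link, placement picks the link with the lowest load-to-capacity ratio, and a coflow is admitted only if its estimated waiting, transfer, and execution times fit within its deadline. All names (Coflow, place, admit, schedule) and the earliest-deadline-first ordering are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Coflow:
    name: str
    bytes_to_send: float  # total data the coflow must transfer
    exec_time: float      # estimated task execution time, in seconds
    deadline: float       # seconds from now


def place(link_load, link_capacity):
    """Stage 1 (assumed rule): propose the link with the lowest relative load."""
    return min(link_load, key=lambda link: link_load[link] / link_capacity[link])


def admit(coflow, link, link_load, link_capacity):
    """Stage 2 (assumed rule): compare the estimated finish time to the deadline."""
    waiting = link_load[link] / link_capacity[link]        # queued bytes ahead of us
    transfer = coflow.bytes_to_send / link_capacity[link]  # our own transmission
    return waiting + transfer + coflow.exec_time <= coflow.deadline


def schedule(coflows, link_capacity):
    """Process coflows earliest-deadline-first (an assumption of this sketch)."""
    link_load = {link: 0.0 for link in link_capacity}  # queued bytes per link
    accepted, rejected = [], []
    for cf in sorted(coflows, key=lambda c: c.deadline):
        link = place(link_load, link_capacity)
        if admit(cf, link, link_load, link_capacity):
            link_load[link] += cf.bytes_to_send
            accepted.append((cf.name, link))
        else:
            rejected.append(cf.name)
    return accepted, rejected
```

Calling schedule with a list of Coflow objects and a dictionary mapping link names to capacities in bytes per second returns the accepted (coflow, link) pairs and the names of coflows that would miss their deadlines. A real scheduler would model multiple flows per coflow and per-port contention, which this sketch deliberately omits.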