Workload-Aware Scheduling Across Geo-Distributed Data Centers

Yibo Jin, Yuan Gao, Zhuzhong Qian, Mingyu Zhai, Hui Peng, Sanglu Lu
DOI: https://doi.org/10.1109/trustcom.2016.0228
2016-01-01
Abstract: With the rapid development of big data applications, more and more data analytics jobs run across geographically distributed data centers. Recent work mainly focuses on task and data placement to reduce data transmission among these geo-distributed data centers. In this paper, we argue that task execution delay may also affect response time, especially in hot-spot data centers. We define the geo-distributed workload-aware scheduling problem, which aims to minimize the overall delay of data transmission and task execution. We then prove it to be NP-complete and propose an online heuristic that effectively redistributes datasets and tasks, which potentially balances the workload among data centers and optimizes the overall response time. Experiments show that our algorithm yields a significant performance improvement across a wide range of data distributions, and could reduce job response time by up to 55% on average.
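To make the trade-off between transmission delay and execution delay concrete, the following is a minimal illustrative sketch of a greedy placement rule. It is not the heuristic proposed in the paper, whose details the abstract does not give. The names `Task`, `greedy_schedule`, `speed`, and `bandwidth` are hypothetical, and the model assumes each data center runs one task at a time and that a task's input transfer can overlap with the queue wait at the destination.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    data_site: str   # data center that currently holds the task's input
    data_gb: float   # input size in GB
    work: float      # abstract compute units required


def greedy_schedule(tasks, speed, bandwidth):
    """Place each task on the site with the earliest estimated finish time.

    speed[site] is compute units per second (assumed model).
    bandwidth[(src, dst)] is GB per second between two sites (assumed model).
    Returns (placement, makespan).
    """
    busy_until = {site: 0.0 for site in speed}  # time each site becomes free
    placement = {}
    # Consider the largest tasks first, a common greedy ordering.
    for task in sorted(tasks, key=lambda t: t.work, reverse=True):
        best_site, best_finish = None, float("inf")
        for site in speed:
            if site == task.data_site:
                transfer = 0.0
            else:
                transfer = task.data_gb / bandwidth[(task.data_site, site)]
            # The task starts once the site is free and the data has arrived.
            start = max(busy_until[site], transfer)
            finish = start + task.work / speed[site]
            if finish < best_finish:
                best_site, best_finish = site, finish
        busy_until[best_site] = best_finish
        placement[task.name] = best_site
    return placement, max(busy_until.values())
```

With three equal tasks whose data all sits in one hot-spot site, this rule moves one task to the idle site because the transfer cost is smaller than the queueing delay it avoids, which is the effect the abstract argues for.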