A Simulator for Data-Intensive Job Scheduling

Matteo Dell'Amico
DOI: https://doi.org/10.48550/arXiv.1306.6023
2013-08-21
Abstract: Size-based schedulers can give excellent results in terms of both average response time and fairness, yet data-intensive computing execution engines generally do not employ them, mainly because job size is not known a priori. In this work, we perform a simulation-based analysis of the performance of size-based schedulers when they are employed with workloads typical of data-intensive computing and with approximate size estimates. The results are very promising: even when size estimation is very imprecise, response times of size-based schedulers can be clearly smaller than those of simple scheduling techniques such as processor sharing or FIFO.
Distributed, Parallel, and Cluster Computing
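
The comparison described in the abstract can be illustrated with a toy sketch. This is not the paper's simulator: it assumes a single server, all jobs arriving at time 0, Pareto-distributed job sizes, and a multiplicative log-normal estimation error, all of which are illustrative assumptions. The function names are hypothetical.

```python
import random

def mean_response_fifo(sizes):
    """All jobs arrive at time 0 and run to completion in the given order."""
    clock, total = 0.0, 0.0
    for s in sizes:
        clock += s          # this job finishes after all earlier ones
        total += clock
    return total / len(sizes)

def mean_response_size_based(sizes, estimates):
    """Run jobs in order of estimated size; true sizes set the run time."""
    order = sorted(range(len(sizes)), key=lambda i: estimates[i])
    return mean_response_fifo([sizes[i] for i in order])

def mean_response_ps(sizes):
    """Processor sharing: the server is split equally among unfinished jobs."""
    ordered = sorted(sizes)
    n = len(ordered)
    clock, served, total = 0.0, 0.0, 0.0
    for k, s in enumerate(ordered):
        active = n - k                    # jobs still in the system
        clock += (s - served) * active    # time until the next completion
        served = s                        # work each surviving job has received
        total += clock
    return total / n

if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed workload: heavy-tailed (Pareto) job sizes.
    sizes = [rng.paretovariate(1.5) for _ in range(1000)]
    # Assumed error model: estimate = size * log-normal noise (sigma = 1).
    estimates = [s * rng.lognormvariate(0.0, 1.0) for s in sizes]
    print("FIFO:               ", mean_response_fifo(sizes))
    print("Processor sharing:  ", mean_response_ps(sizes))
    print("Size-based (noisy): ", mean_response_size_based(sizes, estimates))
    print("Size-based (exact): ", mean_response_size_based(sizes, sizes))
```

With exact estimates, the size-based order is shortest-job-first, which minimizes mean response time in this batch setting. How much of that advantage survives noisy estimates depends on the assumed workload and error distribution, which is the question the paper studies in a more realistic setting.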