Probe Sharing: A Simple Technique to Improve on Sparrow

Wenzhuo Li, Chuang Lin, Puheng Zhang, Mao Miao
DOI: https://doi.org/10.1109/iscc.2017.8024635
2017-01-01
Abstract: As big data analytics frameworks move towards higher degrees of parallelism and shorter task durations to provide lower latency, the resulting millions of scheduling decisions per second pose a great challenge to centralized schedulers. Increasing effort is therefore devoted to distributed scheduling approaches that avoid the throughput limitation of centralized designs. Among these approaches, Sparrow is a leading design. However, because of Sparrow's sample-based techniques, some tasks in subsequent jobs may be scheduled earlier than those in the head-of-line job, which results in scheduling disorder and causes poor response times and unfairness. To address these problems, this paper proposes a simple algorithm called probe sharing: jobs that arrive at the same Sparrow scheduler share their probes, so that all tasks in the head-of-line job are scheduled before those of subsequent jobs. We analyze probe sharing theoretically and prove that it improves on Sparrow. We have implemented probe sharing in Sparrow and show that it reduces scheduling delays by 2.2× and provides 100% fairness. Trace-driven simulations are also used to evaluate probe sharing when scaling to large clusters. In addition, the simplicity of probe sharing makes it applicable to many schedulers that use Sparrow's techniques (e.g., Hopper, Tarcil, and Eagle).
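The ordering effect described in the abstract can be illustrated with a toy model. The sketch below is an assumption-laden reading of the abstract, not the authors' implementation or Sparrow's code: `ToyScheduler`, `on_probe_ready`, `launch_order`, and the example probe order are all hypothetical names and data. It models a single scheduler holding two jobs, A (head-of-line) and B, where workers report back in an order that happens to favor B's probes. Without sharing, a probe can only launch a task of the job it was sent for; with sharing, any probe from this scheduler launches the next task of the earliest job that still has unlaunched tasks.

```python
from collections import OrderedDict, deque


class ToyScheduler:
    """Hypothetical single-scheduler model; not taken from the Sparrow codebase."""

    def __init__(self, share_probes):
        self.share_probes = share_probes
        # job_id -> unlaunched tasks, kept in job arrival order
        self.pending = OrderedDict()

    def submit(self, job_id, tasks):
        self.pending[job_id] = deque(tasks)

    def on_probe_ready(self, probe_job_id):
        """A worker reached a probe sent for probe_job_id and asks for a task."""
        if self.share_probes:
            # Assumed sharing rule: serve the earliest-arrived job that
            # still has unlaunched tasks, whichever job the probe was for.
            for job_id, tasks in self.pending.items():
                if tasks:
                    return job_id, tasks.popleft()
            return None
        # No sharing: the probe can only launch a task of its own job.
        tasks = self.pending.get(probe_job_id)
        if tasks:
            return probe_job_id, tasks.popleft()
        return None


def launch_order(share_probes, ready_probes):
    """Return the order in which tasks launch for a given probe-ready order."""
    scheduler = ToyScheduler(share_probes)
    scheduler.submit("A", ["A1", "A2"])  # head-of-line job
    scheduler.submit("B", ["B1", "B2"])  # subsequent job
    order = []
    for probe_job_id in ready_probes:
        launched = scheduler.on_probe_ready(probe_job_id)
        if launched is not None:
            order.append(launched[1])
    return order


# Hypothetical order in which workers reach the probes: one of B's comes first.
READY = ["B", "A", "B", "A"]

print(launch_order(False, READY))  # ['B1', 'A1', 'B2', 'A2']
print(launch_order(True, READY))   # ['A1', 'A2', 'B1', 'B2']
```

In this toy run, the unshared case interleaves B's tasks ahead of A's, which is the scheduling disorder the abstract describes, while the shared case launches all of A's tasks first. The model ignores probe sampling, worker queues, and task durations, so it says nothing about the reported delay or fairness figures.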