Real-Time Scheduling in MapReduce Clusters

Chen He, Ying Lu, David Swanson
DOI: https://doi.org/10.1109/hpcc.and.euc.2013.216
2013-01-01
Abstract: MapReduce has been widely used as a Big Data processing platform. As its popularity grows, scheduling becomes increasingly important. In particular, because many MapReduce applications require real-time data processing, scheduling real-time applications in MapReduce environments has become a significant problem. In this paper, we present a novel real-time scheduler for MapReduce that overcomes the deficiencies of an existing scheduler. It avoids accepting jobs that would lead to deadline misses and improves cluster utilization. We implement our scheduler in Hadoop, and experimental results show that it provides deadline guarantees for accepted jobs and achieves good cluster utilization.