An enforcement of real time scheduling in Spark Streaming.

Xinyi Liao, Zhiwei Gao, Weixing Ji, Yizhuo Wang
DOI: https://doi.org/10.1109/IGCC.2015.7393730
2015-01-01
Abstract: With the exponential growth of continuous data streams, real-time stream processing has become increasingly popular. Spark Streaming is an open-source framework for reliable, high-throughput, low-latency stream processing. Although it is a near-real-time stream processing framework that runs on commodity hardware, its scheduling system does not guarantee real-time event processing. Profiling results indicate that the total delay of events is markedly more volatile when the inputs are unstable. In this paper, we propose a simple yet effective scheduling strategy that reduces the worst-case event processing time by dynamically adjusting the time window of batch intervals. It is a real-time enhancement to Spark Streaming built on Spark's framework. The proposed strategy is evaluated using two streaming benchmarks, and our preliminary results demonstrate the feasibility of our approach with unstable event streams.
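
The abstract does not spell out the adjustment rule, so the following is only an illustrative sketch of what "dynamically adjusting the time window of batch intervals" could look like, not the authors' algorithm. It assumes a simple feedback heuristic: lengthen the interval when a batch takes longer to process than the interval itself (the system is falling behind), and shorten it when processing finishes well within the interval. The function name `next_batch_interval` and the parameters `min_ms`, `max_ms`, `target_ratio`, and `step` are hypothetical, and the code does not call any Spark API.

```python
def next_batch_interval(interval_ms, processing_ms,
                        min_ms=100, max_ms=5000,
                        target_ratio=0.8, step=0.25):
    """Suggest the next batch interval from the last batch's processing time.

    Hypothetical heuristic (an assumption, not taken from the paper):
      - processing_ms > interval_ms: batches are queuing up, so grow the window.
      - processing_ms < target_ratio * interval_ms: there is slack, so shrink it.
      - otherwise: keep the current window.
    The result is clamped to [min_ms, max_ms].
    """
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")

    ratio = processing_ms / interval_ms
    if ratio > 1.0:
        proposed = interval_ms * (1 + step)   # falling behind: widen the window
    elif ratio < target_ratio:
        proposed = interval_ms * (1 - step)   # spare capacity: narrow the window
    else:
        proposed = interval_ms                # within the target band: no change

    return int(min(max_ms, max(min_ms, proposed)))


if __name__ == "__main__":
    # Simulated processing times (ms) for an unstable input stream.
    interval = 1000
    for processing in [400, 450, 1600, 1900, 1200, 600]:
        interval = next_batch_interval(interval, processing)
        print(f"processing={processing} ms -> next interval={interval} ms")
```

In stock Spark Streaming the batch interval is fixed when the `StreamingContext` is created, so a rule like this would have to live inside a modified scheduler, which is consistent with the paper describing its approach as an enhancement to the framework.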