An On-the-Fly Scheduling Strategy for Distributed Stream Processing Platform.

Wen'an Wang, Chuang Zhang, Xiaojun Chen, Zhao Li, Hong Ding, Xin Wen
DOI: https://doi.org/10.1109/bdcloud.2018.00116
2018-01-01
Abstract: Distributed stream processing enables real-time processing of continuous, high-velocity streaming data to extract valuable information. Because data streams are highly dynamic, however, keeping stream applications running stably and efficiently requires continuous online scheduling. This paper therefore proposes an on-the-fly scheduling strategy for distributed stream processing environments. The strategy dynamically predicts abnormal events through double exponential smoothing and adopts a traffic-aware active migration protocol that adjusts the network routing structure on the fly to balance the inter-worker load. An evaluation method is also proposed to quantitatively analyze the various scheduling objectives. Finally, we apply the scheduling strategy to a stream processing platform that treats Docker instances as the basic scheduling units, and, based on this platform and the evaluation method, we carry out performance comparison experiments on the scheduling algorithm. The experimental results indicate that our algorithm performs well in topology throughput, average processing time, and task load balance, making it suitable for deployment in distributed environments with large numbers of nodes and tasks.
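The prediction step mentioned in the abstract can be illustrated with a minimal sketch of double exponential smoothing. The abstract does not give the paper's exact formulation, so this uses Holt's linear-trend variant as an assumption. The function names `holt_forecast` and `predicts_overload`, the parameters `alpha`, `beta`, and `horizon`, and the fixed `threshold` rule for flagging an abnormal event are all hypothetical and chosen only for illustration.

```python
def holt_forecast(series, alpha=0.5, beta=0.3, horizon=1):
    """Forecast `horizon` steps ahead with Holt's double exponential smoothing.

    Assumption: Holt's linear-trend form; the paper may use a different variant.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations to estimate a trend")
    # Initialise the level with the first sample and the trend with the first difference.
    level = series[0]
    trend = series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        # Smooth the level toward the new observation.
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        # Smooth the trend toward the latest change in level.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend


def predicts_overload(load_history, threshold, **kwargs):
    """Hypothetical rule: flag a worker if its forecast load exceeds `threshold`."""
    return holt_forecast(load_history, **kwargs) > threshold


if __name__ == "__main__":
    # Illustrative load samples for one worker (made-up values).
    history = [10, 20, 30, 40]
    print(holt_forecast(history))
    print(predicts_overload(history, threshold=45))
```

In a scheduler of the kind the abstract describes, a positive prediction like this would be the trigger for migrating tasks away from the worker before it actually becomes overloaded; the migration protocol itself is not sketched here because the abstract gives no detail on it.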