Network-Aware Grouping in Distributed Stream Processing Systems.

Fei Chen, Song Wu, Hai Jin
DOI: https://doi.org/10.1007/978-3-030-05051-1_1
2018-01-01
Abstract: Distributed Stream Processing (DSP) systems have recently attracted much attention because of their ability to process huge volumes of real-time stream data with very low latency on clusters of commodity hardware. Existing workload grouping strategies in a DSP system can be classified into four categories (i.e., raw and blind, data skewness, cluster heterogeneity, and dynamic load-aware). However, these traditional stream grouping strategies do not consider the network distance between two communicating operators. In fact, traffic over different network channels has a significant impact on performance. How to group tuples according to network distance in order to improve performance has therefore become a critical problem.
What problem does this paper attempt to address?