Using Paralleled-PEs Method to Resolve the Bursting Data in Distributed Stream Processing System

Baojian Zhou, Zhongzhi Luan, Jieqian Wu, Ming Xie
DOI: https://doi.org/10.1109/CSE.2013.196
2013-01-01
Abstract: Distributed stream processing applications (DSPAs) process real-time data in a distributed environment. These applications consist of processing elements (PEs) and different kinds of streams running in a cluster or cloud environment. However, such applications may encounter hotspots and bursting data within a very short time while running in the cluster or cloud. Furthermore, because these applications typically run for a long time, resources such as memory and CPU power can become imbalanced across the nodes of the environment. To address these two problems, we propose a paralleled-PEs method to relieve hotspots and bursting data in DSPAs. We also propose a dynamic, situation-aware method that can be combined with the paralleled-PEs method to relieve imbalance and improve resource utilization. We implement our methods using shell scripts and Java packages. In our experiments, we deploy both methods on S4, an open-source stream computation platform. The experimental evaluation shows that our methods achieve lower delay and better resource utilization. The results also show that our methods achieve higher throughput than other methods.