A real-time top-k query algorithm and parallelized implementation

Yao Lu, Jun Liu
DOI: https://doi.org/10.1109/CCIS.2014.7175722
2014-01-01
Abstract: The analysis of data streams is of great value in many fields, such as network monitoring and sensor instrumentation. As a common operation, the top-k query over a data stream is the basis and core of other problems in data stream analysis. In this paper, we introduce a parallel algorithm based on the Frequent algorithm and implement it using Apache Storm. We then evaluate the algorithm by its estimation error under various conditions and show that adjusting the degree of parallelism can effectively improve the precision of top-k queries. The parallelized implementation is significant for network traffic monitoring.
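For orientation, the following is a minimal Python sketch of the Frequent algorithm (also known as Misra-Gries) that the abstract builds on, together with one possible way to parallelize it. The abstract does not describe the paper's actual parallel design, so the hash partitioning and merge step here are assumptions for illustration only, and the names `frequent`, `parallel_top_k`, `m`, and `workers` are hypothetical. The sketch runs in a single process and does not use Apache Storm.

```python
import heapq
import zlib


def frequent(stream, m):
    """Frequent (Misra-Gries) summary that keeps at most m counters.

    Each returned count is a lower bound on the item's true frequency.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < m:
            counters[item] = 1
        else:
            # No free counter: decrement all counters, drop those that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def parallel_top_k(stream, k, m, workers):
    """Assumed parallel scheme: partition by key hash, summarize, then merge."""
    # Assumption: items are routed by a stable hash of their key, so every
    # occurrence of a given key is seen by the same worker.
    partitions = [[] for _ in range(workers)]
    for item in stream:
        index = zlib.crc32(str(item).encode("utf-8")) % workers
        partitions[index].append(item)

    # Each worker runs Frequent on its own partition. Keys are disjoint across
    # partitions, so the summaries can be merged without adding counts.
    merged = {}
    for partition in partitions:
        merged.update(frequent(partition, m))

    # Return the k items with the largest estimated counts.
    return heapq.nlargest(k, merged.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    demo = ["a"] * 100 + ["b"] * 30 + ["c"] * 10 + ["d", "e", "f", "g"]
    print(parallel_top_k(demo, k=2, m=2, workers=2))
```

Under this assumed scheme, raising `workers` while keeping `m` fixed per worker increases the total number of counters and reduces the number of distinct keys competing within each partition. That is one plausible reading of how the degree of parallelism could affect precision, but it is not confirmed by the abstract.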