An Accurate Estimation Algorithm for Big Data Streams.

Qin Xin, Jianping Wu
DOI: https://doi.org/10.1007/s10619-018-7225-5
IF: 0.974
2018-01-01
Distributed and Parallel Databases
Abstract: A sketch is a memory-efficient data structure used to store and query the frequency of any item in a given multiset. Because it supports fast queries and updates, it has been applied in various fields. Different sketches have different advantages and disadvantages. Sketches were originally proposed for estimating flow sizes in network measurement, where the key factors are insertion speed and accuracy. In this paper, we propose a new sketch that significantly improves insertion speed while also improving accuracy. Our key methods include on-chip/off-chip separation and a partial update algorithm. Extensive experimental results show that our sketch significantly outperforms the state of the art in terms of both accuracy and speed.
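For readers unfamiliar with the data structure, the following is a minimal illustration of how a sketch stores and queries item frequencies. It shows a generic Count-Min sketch, not the sketch proposed in this paper: the on-chip/off-chip separation and the partial update algorithm are not modeled here. The class name, the default `width` and `depth` values, and the choice of BLAKE2b as the hash function are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Generic Count-Min sketch (illustrative; not the paper's design)."""

    def __init__(self, width=1024, depth=4):
        # depth independent rows of width counters each (assumed sizes).
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, derived by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def insert(self, item, count=1):
        # Insertion touches one counter in every row.
        for row in range(self.depth):
            self.rows[row][self._index(item, row)] += count

    def estimate(self, item):
        # Hash collisions can only inflate a counter, so the minimum across
        # rows is the tightest estimate and never undercounts.
        return min(
            self.rows[row][self._index(item, row)] for row in range(self.depth)
        )


sketch = CountMinSketch()
for flow in ["flow-a", "flow-a", "flow-a", "flow-b"]:
    sketch.insert(flow)
```

After these insertions, `sketch.estimate("flow-a")` returns at least 3: the estimate can overcount when items collide, but it never undercounts. Each insertion updates every row, and that per-insertion cost is what the abstract refers to as insertion speed.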