A Fast Algorithm for Approximate Quantiles in High Speed Data Streams

Qi Zhang, Wei Wang
DOI: https://doi.org/10.1109/SSDBM.2007.27
2007-01-01
Abstract: We present a fast algorithm for computing approximate quantiles in high speed data streams with deterministic error bounds. For data streams of size $N$ where $N$ is unknown in advance, our algorithm partitions the stream into sub-streams of exponentially increasing size as they arrive. For each sub-stream, which has a fixed size, we compute and maintain a multi-level summary structure using a novel algorithm. In order to achieve high speed performance, the algorithm uses simple block-wise merge and sample operations. Overall, our algorithms for fixed-size streams and arbitrary-size streams have a computational cost of $O(N \log(\frac{1}{\epsilon} \log \epsilon N))$ and an average per-element update cost of $O(\log \log N)$ if $\epsilon$ is fixed.
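The abstract does not spell out the summary structure, so the following is only a minimal sketch of what a multi-level summary built from block-wise merge and sample operations could look like. It is a generic buffer-collapsing scheme, not the authors' algorithm: the class name `MergeSampleSummary`, the parameter `block_size`, and the rule "merge two sorted blocks, keep every second element" are all assumptions made for illustration, and no error bound from the paper is claimed for it.

```python
import heapq


class MergeSampleSummary:
    """Illustrative multi-level summary (hypothetical, not the paper's structure).

    Level i holds at most one sorted block of `block_size` elements, and each
    element at level i stands for 2**i elements of the original stream.
    """

    def __init__(self, block_size):
        self.block_size = block_size  # assumed tuning parameter
        self.buffer = []              # unsorted incoming elements (weight 1)
        self.levels = []              # levels[i] is a sorted list or None

    def update(self, x):
        """Add one stream element; flush a full buffer into level 0."""
        self.buffer.append(x)
        if len(self.buffer) == self.block_size:
            block = sorted(self.buffer)
            self.buffer = []
            self._carry(block, 0)

    def _carry(self, block, level):
        """Insert a block at `level`, merging upward like binary addition."""
        while True:
            if level == len(self.levels):
                self.levels.append(None)
            if self.levels[level] is None:
                self.levels[level] = block
                return
            # Merge: combine two sorted blocks into one sorted run.
            merged = list(heapq.merge(self.levels[level], block))
            self.levels[level] = None
            # Sample: keep every second element, halving the run.
            block = merged[1::2]
            level += 1

    def quantile(self, phi):
        """Return an approximate phi-quantile using weighted ranks."""
        weighted = [(x, 1) for x in self.buffer]
        for i, blk in enumerate(self.levels):
            if blk:
                weighted.extend((x, 2 ** i) for x in blk)
        if not weighted:
            raise ValueError("empty summary")
        weighted.sort()
        total = sum(w for _, w in weighted)
        target = phi * total
        running = 0
        for x, w in weighted:
            running += w
            if running >= target:
                return x
        return weighted[-1][0]


summary = MergeSampleSummary(block_size=16)
for value in range(1024):
    summary.update(value)
approx_median = summary.quantile(0.5)
```

Each element costs one buffer append, and a merge at level i happens only once per `block_size * 2**(i + 1)` arrivals, which is why schemes of this shape keep the amortized per-element work low. Handling a stream of unknown length by starting a fresh summary for each sub-stream of exponentially increasing size, as the abstract describes, is not shown here.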