An efficient algorithm for approximate biased quantile computation in data streams.

Qi Zhang, Wei Wang
DOI: https://doi.org/10.1145/1321440.1321601
2007-01-01
Abstract: We propose an efficient algorithm for approximate biased quantile computation in large data streams. Our algorithm computes decomposable biased quantile summaries on fixed-size blocks and dynamically maintains the biased quantile summary for the entire stream as an exponential histogram over the block-wise quantile summaries. The algorithm is computationally efficient and achieves an amortized computational cost of $O(\log(\frac{1}{\epsilon}\log(\epsilon n)))$ and a space requirement of $O(\frac{\log^3 \epsilon n}{\epsilon})$. Our algorithm does not assume prior knowledge of the stream sizes or the range of data values in the streams. In practice, our algorithm efficiently maintains summaries over large data streams with tens of millions of observations and achieves a significant performance improvement over prior algorithms.
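The layout the abstract describes (summaries built on fixed-size blocks, then combined level by level like an exponential histogram) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the paper: the class `BlockedQuantileSketch`, the parameters `eps` and `block_size`, and in particular the pruning rule in `_prune`, which is a simplified stand-in that keeps low ranks fine-grained but does not reproduce the paper's error, time, or space bounds.

```python
import heapq


class BlockedQuantileSketch:
    """Illustrative only: block summaries stored in exponential-histogram levels."""

    def __init__(self, eps=0.05, block_size=64):
        self.eps = eps                # assumed relative rank-error parameter
        self.block_size = block_size  # assumed fixed block size
        self.buffer = []              # raw items of the block being filled
        self.levels = []              # levels[i] covers block_size * 2**i items, or None
        self.n = 0                    # number of items seen so far

    def insert(self, x):
        self.buffer.append(x)
        self.n += 1
        if len(self.buffer) == self.block_size:
            # A full block becomes a summary of (value, weight) entries.
            summary = [(v, 1) for v in sorted(self.buffer)]
            self.buffer = []
            self._carry(summary)

    def _carry(self, summary):
        # Binary-counter carry: merge summaries of equal level upward.
        i = 0
        while i < len(self.levels) and self.levels[i] is not None:
            merged = heapq.merge(self.levels[i], summary, key=lambda e: e[0])
            summary = self._prune(list(merged))
            self.levels[i] = None
            i += 1
        if i == len(self.levels):
            self.levels.append(None)
        self.levels[i] = summary

    def _prune(self, merged):
        # Simplified rule (an assumption): fold an entry into its predecessor
        # while the combined weight stays within eps * rank, so entries near
        # the low end of the ranking are kept at finer granularity.
        out, rank = [], 0
        for v, w in merged:
            rank += w
            if out and out[-1][1] + w <= self.eps * rank:
                out[-1] = (v, out[-1][1] + w)  # larger value represents the group
            else:
                out.append((v, w))
        return out

    def query(self, phi):
        # Scan all summaries plus the partial block in value order and return
        # the first value whose accumulated weight reaches phi * n.
        parts = [s for s in self.levels if s is not None]
        parts.append([(v, 1) for v in sorted(self.buffer)])
        target, rank, last = phi * self.n, 0, None
        for v, w in heapq.merge(*parts, key=lambda e: e[0]):
            rank += w
            last = v
            if rank >= target:
                return v
        return last
```

In this sketch a caller would feed items with `insert(x)` and ask for an approximate quantile with `query(phi)`, for example `query(0.01)` for a low quantile. The number of stored summaries grows with the logarithm of the number of blocks, which is the structural idea behind the exponential-histogram layout; the accuracy of the answers depends entirely on the assumed pruning rule.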