Needle in a Haystack: Max/Min Online Aggregation in the Cloud

Xiang Ci, Fengming Wang, Xiaofeng Meng
DOI: https://doi.org/10.1007/978-3-319-22324-7_21
2015-01-01
Abstract: With the development of social networks, the mobile Internet, etc., an increasing amount of data is being generated, which is beyond the processing ability of traditional data management tools. In many real-life applications, users can accept approximate answers accompanied by accuracy guarantees. One of the most commonly used approaches to approximate query processing is online aggregation. Most existing work on online aggregation in the cloud focuses on aggregation functions such as Count, Sum and Avg, while there is little work on Max/Min online aggregation in the cloud. In this paper, we measure the accuracy of Max/Min online aggregation by using a quantile, which is derived from Chebyshev’s inequality and the central limit theorem. We implement our methods in a cloud online aggregation system called COLA, and the experimental results demonstrate that our method can deliver reasonable online Max/Min estimates within an acceptable time period.
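The abstract does not spell out how the quantile is derived, so the following is only a minimal sketch of one way Chebyshev's inequality can attach a quantile to a running maximum. It is not the authors' estimator, and the central limit theorem part is not reproduced. The function name `chebyshev_quantile_lower_bound` is hypothetical. Chebyshev's inequality states that at most 1/k² of a distribution lies k or more standard deviations from its mean, so a maximum observed k standard deviations above the mean sits at a quantile of at least 1 - 1/k². Substituting the sample mean and standard deviation for the true ones, as done here, makes the bound approximate.

```python
import math


def chebyshev_quantile_lower_bound(sample):
    """Approximate lower bound on the quantile rank of max(sample).

    Hypothetical helper for illustration only. It applies Chebyshev's
    inequality, P(|X - mu| >= k * sigma) <= 1 / k**2, with the sample
    mean and sample standard deviation standing in for mu and sigma
    (an assumption, so the result is an approximation, not a guarantee).
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two values")

    mean = sum(sample) / n
    # Unbiased sample variance.
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    if variance == 0:
        return 0.0  # all values equal: the bound carries no information

    # Distance of the running maximum from the mean, in standard deviations.
    k = (max(sample) - mean) / math.sqrt(variance)
    if k <= 1:
        return 0.0  # Chebyshev is vacuous for k <= 1

    return 1.0 - 1.0 / (k * k)


if __name__ == "__main__":
    seen_so_far = [0, 0, 0, 0, 0, 0, 0, 0, 0, 10]
    print(chebyshev_quantile_lower_bound(seen_so_far))
```

In an online setting, such a function would be re-evaluated as more data is scanned, so the reported quantile for the current maximum can be refined over time. For a Min query, the same reasoning applies to the distance of the minimum below the mean.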