Fine-Grained Probability Counting: Refined LogLog Algorithm

Lun Wang, Zekun Cai, Hao Wang, Jie Jiang, Tong Yang, Bin Cui, Xiaoming Li
DOI: https://doi.org/10.1109/bigcomp.2018.00034
2018-01-01
Abstract: Estimating the number of distinct flows, also called the cardinality, is an important problem in many network applications, such as traffic measurement and anomaly detection. The challenge is to achieve high accuracy at line speed with a small amount of auxiliary memory. The state-of-the-art LogLog algorithm uses log log N_max memory, where N_max is the a priori upper bound on the cardinality, and achieves an accuracy on the order of 1/√d, where d is the number of counters. In this paper, we propose a refined version of the LogLog algorithm, namely Refined LogLog. It achieves much better accuracy than the original LogLog algorithm by using more fine-grained common ratios. The algorithm is validated by a detailed analysis. A self-adaptive version, Self-Adaptive LogLog, is also proposed based on Refined LogLog to adapt automatically to cardinalities of different scales. Our experimental results show that Refined LogLog outperforms LogLog in accuracy by up to 67.0% and reduces the standard deviation by up to 60.8%.
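For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of the original LogLog estimator (Durand and Flajolet), not the Refined LogLog method proposed in the paper, whose details the abstract does not give. The function names, the use of SHA-256 as the hash, the 64-bit hash width, and the default k = 10 (d = 1024 counters) are all assumptions made for illustration.

```python
import hashlib

# Asymptotic bias-correction constant from the original LogLog analysis.
ALPHA_INF = 0.39701

HASH_BITS = 64  # assumed hash width for this sketch


def _hash64(item: str) -> int:
    """Map an item to a 64-bit integer (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def loglog_estimate(items, k: int = 10) -> float:
    """Estimate the number of distinct items using d = 2**k counters."""
    d = 1 << k
    counters = [0] * d
    tail_bits = HASH_BITS - k

    for item in items:
        h = _hash64(item)
        idx = h >> tail_bits                  # first k bits select a counter
        tail = h & ((1 << tail_bits) - 1)     # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in the tail
        # (tail_bits + 1 if the tail is all zeros).
        rank = tail_bits - tail.bit_length() + 1
        if rank > counters[idx]:
            counters[idx] = rank              # each counter keeps the max rank

    mean_rank = sum(counters) / d
    return ALPHA_INF * d * 2 ** mean_rank


if __name__ == "__main__":
    flows = [f"flow-{i}" for i in range(100_000)]
    print(round(loglog_estimate(flows)))
```

Each counter stores only a rank, which is why a counter needs on the order of log log N_max bits, and averaging over d counters is what gives the 1/√d behaviour the abstract mentions. Because each counter keeps a maximum, repeated items never change the estimate.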