Virtual self-adaptive bitmap for online cardinality estimation

Jie Lu, Hongchang Chen, Jianpeng Zhang, Tao Hu, Penghao Sun, Zhen Zhang
DOI: https://doi.org/10.1016/j.is.2022.102160
IF: 3.18
2022-12-23
Information Systems
Abstract: Cardinality estimation is the task of obtaining the number of distinct items in a data stream, and it plays an important role in many application domains. However, for high-speed data streams, estimating cardinality while keeping record/query overhead low and memory use efficient remains a significant challenge. This paper proposes a virtual self-adaptive bitmap estimator to support online cardinality estimation, which reduces the record overhead to one hash per item for the first time. By logically adding virtual bits, our estimator automatically adapts its sampling probability to different stream sizes. We evaluate the virtual self-adaptive bitmap theoretically and experimentally. The experimental results show that our estimator significantly improves over existing work in terms of record throughput, query throughput and estimation accuracy.
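For background, the sketch below shows a classic fixed-size bitmap estimator (linear counting), which also records each item with a single hash. It is not the virtual self-adaptive bitmap proposed in the paper: it has no virtual bits and no adaptive sampling probability, and the class name, method names, and choice of hash function are assumptions made purely for illustration.

```python
import hashlib
import math


class LinearCountingBitmap:
    """Classic, non-adaptive bitmap estimator (illustrative only)."""

    def __init__(self, num_bits: int) -> None:
        self.num_bits = num_bits
        # One byte per logical bit keeps the sketch simple to read.
        self.bits = bytearray(num_bits)
        self.zero_count = num_bits

    def record(self, item: str) -> None:
        # A single hash per item selects the bit to set.
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
        index = int.from_bytes(digest, "big") % self.num_bits
        if self.bits[index] == 0:
            self.bits[index] = 1
            self.zero_count -= 1

    def estimate(self) -> float:
        # Linear counting: n ~ m * ln(m / z), with m bits and z zero bits.
        if self.zero_count == 0:
            return float("inf")  # bitmap saturated; no finite estimate
        return self.num_bits * math.log(self.num_bits / self.zero_count)


if __name__ == "__main__":
    bitmap = LinearCountingBitmap(num_bits=8192)
    for repeat in range(3):          # duplicates do not change the bitmap
        for i in range(1000):
            bitmap.record(f"flow-{i}")
    print(round(bitmap.estimate()))  # close to 1000 distinct items
```

A fixed bitmap like this saturates once the stream's cardinality grows well beyond the number of bits, which is the limitation that adapting the sampling probability to the stream size is meant to address.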
computer science, information systems