LadderFilter: Filtering Infrequent Items with Small Memory and Time Overhead

Yuanpeng Li, Feiyu Wang, Xiang Yu, Yilong Yang, Kaicheng Yang, Tong Yang, Zhuo Ma, Bin Cui, Steve Uhlig
DOI: https://doi.org/10.1145/3588690
2023-01-01
Abstract: Data stream processing is critical in streaming databases. Existing works pay a lot of attention to frequent items. To improve the accuracy for frequent items, existing solutions focus on accurately filtering infrequent items. While these solutions are effective, they keep track of all infrequent items and require multiple hash computations and memory accesses, which increases both memory and time overhead. To reduce this overhead, we propose LadderFilter, which can discard infrequent items efficiently in terms of both memory and time. To achieve memory efficiency, LadderFilter discards (approximately) infrequent items using multiple LRU queues. To achieve time efficiency, we leverage SIMD instructions to implement the LRU policy without timestamps. We apply LadderFilter to four types of sketches. Our experimental results show that LadderFilter improves accuracy by up to 60.6× and throughput by up to 1.37×, and can maintain high accuracy with small memory usage. All related code is provided open-source at GitHub.
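
The idea of discarding infrequent items with multiple LRU queues can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two-level layout, the class names (`LRUBucketArray`, `LadderFilterSketch`), and the parameters (`cells`, `promote_at`, `report_at`) are assumptions made for illustration, and `OrderedDict` stands in for the SIMD-based, timestamp-free LRU buckets mentioned in the abstract.

```python
from collections import OrderedDict


class LRUBucketArray:
    """Hash-indexed array of small LRU buckets, each mapping key -> count."""

    def __init__(self, num_buckets, cells_per_bucket):
        self.buckets = [OrderedDict() for _ in range(num_buckets)]
        self.cells = cells_per_bucket

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def get(self, key):
        return self._bucket(key).get(key)

    def put(self, key, count):
        """Store key as most recently used; return the evicted (key, count) or None."""
        bucket = self._bucket(key)
        bucket[key] = count
        bucket.move_to_end(key)
        if len(bucket) > self.cells:
            return bucket.popitem(last=False)  # drop the least recently used cell
        return None


class LadderFilterSketch:
    """Hypothetical two-level ladder: new items enter `low`; an item evicted
    from `low` climbs to `high` only if it was seen at least `promote_at` times."""

    def __init__(self, num_buckets=64, cells=4, promote_at=2, report_at=8):
        self.low = LRUBucketArray(num_buckets, cells)
        self.high = LRUBucketArray(num_buckets, cells)
        self.promote_at = promote_at
        self.report_at = report_at

    def insert(self, key):
        """Return True once the tracked count of key reaches report_at."""
        count = self.high.get(key)
        if count is not None:
            count += 1
            self.high.put(key, count)  # key already present, so nothing is evicted
            return count >= self.report_at

        count = (self.low.get(key) or 0) + 1
        evicted = self.low.put(key, count)
        if evicted is not None:
            old_key, old_count = evicted
            if old_count >= self.promote_at:
                # Promote; whatever this pushes out of `high` is discarded.
                self.high.put(old_key, old_count)
            # Otherwise the evicted item is treated as infrequent and forgotten.
        return count >= self.report_at
```

In this sketch, an item for which `insert` returns True would be forwarded to the downstream sketch as a frequent-item candidate, while items that fall off the lower queue with a small count are dropped without ever being stored elsewhere, which is where the memory saving would come from.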