Learned Bloom Filter for Multi-key Membership Testing

Yunchuan Li, Ziwei Wang, Ruixin Yang, Yan Zhao, Rui Zhou, Kai Zheng
DOI: https://doi.org/10.1007/978-3-031-30637-2_5
2023-01-01
Abstract: Multi-key membership testing checks whether a queried element exists in a given set of multi-key elements. It is a fundamental operation for computing systems and networking applications such as web search, mail systems, distributed databases, firewalls, and network routing. Most existing studies on membership testing build on the Bloom filter, a space-efficient and high-security probabilistic data structure. However, the traditional Bloom filter performs poorly in multi-key scenarios. Recently, a variant that combines machine learning methods with the Bloom filter, known as the Learned Bloom Filter (LBF), has drawn increasing attention for its significant reductions in space occupation and False Positive Rate (FPR). More importantly, because it introduces a learned model, LBF can address some of the Bloom filter's problems in multi-key scenarios. We therefore propose a Multi-key LBF (MLBF) data structure, which contains a value-interaction-based multi-key classifier and a multi-key Bloom filter. To reduce FPR further, we also propose an Interval-based MLBF, which divides keys into specific intervals according to the data distribution. Extensive experiments on two real datasets confirm the superiority of the proposed data structures in terms of FPR and query efficiency.
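The general learned Bloom filter idea mentioned in the abstract can be sketched as follows: a model scores each element, and every set member the model would reject is stored in a conventional backup Bloom filter, so no false negatives occur. This is a minimal sketch of a generic LBF, not the authors' MLBF or its value-interaction-based classifier. The names `BloomFilter`, `LearnedBloomFilter`, `toy_score`, the threshold, and the filter sizes are all assumptions made for illustration, and `toy_score` is a hand-written stand-in for a trained classifier.

```python
import hashlib
from typing import Callable, Iterable, Iterator, Tuple

Element = Tuple[str, ...]  # a multi-key element, e.g. (user, domain)


class BloomFilter:
    """Plain Bloom filter over multi-key elements (illustrative only)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, element: Element) -> Iterator[int]:
        # Join the keys with a separator and derive one bit position per salted hash.
        data = "\x1f".join(element).encode("utf-8")
        for seed in range(self.num_hashes):
            salt = seed.to_bytes(8, "little")
            digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, element: Element) -> None:
        for pos in self._positions(element):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, element: Element) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(element)
        )


class LearnedBloomFilter:
    """Score function plus backup filter; members never test negative."""

    def __init__(
        self,
        members: Iterable[Element],
        score: Callable[[Element], float],
        threshold: float,
        num_bits: int = 1024,
        num_hashes: int = 3,
    ) -> None:
        self.score = score
        self.threshold = threshold
        self.backup = BloomFilter(num_bits, num_hashes)
        # Only members the model would miss go into the backup filter.
        for element in members:
            if score(element) < threshold:
                self.backup.add(element)

    def __contains__(self, element: Element) -> bool:
        return self.score(element) >= self.threshold or element in self.backup


def toy_score(element: Element) -> float:
    # Hypothetical stand-in for a trained classifier: favors one domain.
    return 0.9 if element[1] == "example.com" else 0.1


members = [("alice", "example.com"), ("bob", "example.com"), ("carol", "other.net")]
lbf = LearnedBloomFilter(members, toy_score, threshold=0.5)
```

In this sketch `("carol", "other.net")` scores below the threshold, so it is the only member stored in the backup filter. A non-member such as `("mallory", "example.com")` is accepted by the stand-in model, which illustrates how the model itself contributes false positives.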