RoBF: an Auto-Tuning Bloom Filter for Mixed Queries on LSM-Tree.

Ruicheng Liu, Peiquan Jin, Shouhong Wan, Bei Hua
DOI: https://doi.org/10.18293/seke2021-142
2021-01-01
Abstract: The Bloom filter is an efficient technique for improving query performance in LSM-tree-based databases such as RocksDB, HBase, and Cassandra. However, the original Bloom filter uses a fixed false positive rate (FPR), which makes it inefficient for mixed workloads that involve both point and range queries. To solve this problem, we present an improved Bloom filter called RoBF (Range-Query-Oriented Bloom Filter), which uses a mixture of Bloom filters and can process mixed queries on the LSM-tree efficiently. We design an efficient algorithm for generating this mixture based on the query distribution. We compare our proposal with the trie-based filter and find that each has its own advantages in different scenarios. Therefore, we propose using different filters of varied sizes at different levels of the LSM-tree. Following this idea, we present an algorithm that generates a specific filter with a specific size for each level of the LSM-tree, optimizing the performance of mixed queries under limited memory space. We conduct comparative experiments against various competitors, and the results show that RoBF improves the performance of evaluating mixed queries by 6x to 30x compared to the original Bloom filter in RocksDB.
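To make the baseline concrete, the following is a minimal Python sketch of a standard Bloom filter sized for one fixed FPR, with a naive range check that probes every integer key in the range. It is not the RoBF algorithm, which the abstract does not specify; the class and function names (`BloomFilter`, `may_contain_range`, `range_fpr`), the SHA-256 double hashing, and the assumption of integer keys are all illustrative choices. The sizing formulas are the textbook ones.

```python
import hashlib
import math


class BloomFilter:
    """Standard Bloom filter sized for a single, fixed target FPR (illustrative)."""

    def __init__(self, expected_keys: int, target_fpr: float):
        # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(
            1, math.ceil(-expected_keys * math.log(target_fpr) / math.log(2) ** 2)
        )
        self.num_hashes = max(1, round(self.num_bits / expected_keys * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: int):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(str(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: int) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, key: int) -> bool:
        # Point query: False means definitely absent, True means possibly present.
        return all(
            (self.bits[pos // 8] >> (pos % 8)) & 1 for pos in self._positions(key)
        )

    def may_contain_range(self, low: int, high: int) -> bool:
        # Naive range query (assumption): one probe per integer key in [low, high].
        return any(self.may_contain(k) for k in range(low, high + 1))


def range_fpr(point_fpr: float, width: int) -> float:
    """False positive probability for an empty range, assuming independent probes."""
    return 1.0 - (1.0 - point_fpr) ** width


if __name__ == "__main__":
    bf = BloomFilter(expected_keys=1000, target_fpr=0.01)
    for key in range(0, 100_000, 100):
        bf.add(key)
    print(bf.may_contain(500), bf.may_contain_range(95, 105))
    print(range_fpr(0.01, 1), range_fpr(0.01, 100))
```

Under the independence assumption, a filter tuned to a 1% point-query FPR gives a false positive on an empty range of 100 keys about 63% of the time, since 1 - 0.99^100 ≈ 0.63. This compounding is one way to see why a single fixed FPR fits mixed point and range workloads poorly.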