Wormhole Filters: Caching Your Hash on Persistent Memory.
Hancheng Wang, Haipeng Dai, Rong Gu, Youyou Lu, Jiaqi Zheng, Jingsong Dai, Shusen Chen, Zhiyuan Chen, Shuaituan Li, Guihai Chen
DOI: https://doi.org/10.1145/3627703.3629590
2024-01-01
Abstract: Approximate membership query (AMQ) data structures can approximately determine whether an element is in a set with high efficiency. They are widely used in distributed systems, database systems, bioinformatics, IoT applications, data stream mining, etc. However, the memory consumption of AMQ data structures grows rapidly as the data scale grows, which limits the system's ability to process a massive amount of data. The emerging persistent memory provides close-to-DRAM access speed and terabyte-level capacity, enabling AMQ data structures to handle massive data. Nevertheless, existing AMQ data structures perform poorly on persistent memory due to intensive random accesses and/or sequential writes. Therefore, we propose a novel AMQ data structure called the wormhole filter, which achieves high performance on persistent memory by reducing random accesses and sequential writes. In addition, we reduce the number of log records for lower recovery overhead. Theoretical analysis and experimental results show that wormhole filters significantly outperform competitive state-of-the-art AMQ data structures. For example, wormhole filters achieve 23.26× insertion throughput, 1.98× positive lookup throughput, and 8.82× deletion throughput of the best competing baseline.