SplitDB: Closing the Performance Gap for LSM-Tree-Based Key-Value Stores

Miao Cai, Xuzhen Jiang, Junru Shen, Baoliu Ye
DOI: https://doi.org/10.1109/tc.2023.3326982
IF: 3.183
2024-01-01
IEEE Transactions on Computers
Abstract: The Log-Structured Merge Tree (LSM tree) serves as the core data storage engine in modern key-value stores. Its adoption has accelerated rapidly with the growth of cloud computing and data centers. Despite its widespread use, the LSM tree still faces severe performance issues such as write stalls, write amplification, and read inefficiency. This article presents research on improving LSM-tree-based key-value store performance using emerging Non-Volatile Memory (NVM) technology. Our performance diagnosis reveals that these issues result primarily from intensive processing of hot key-value data, compounded by slow storage devices. To address hotspot bottlenecks, we propose a split log-structured merge tree over hybrid storage that leverages the intrinsic hot and cold data separation property of the LSM tree. Our approach promotes the frequently accessed, small-sized high levels onto fast NVM and offloads the remaining cold, large-sized low levels onto slow devices, effectively closing the performance gap for DRAM-disk-based LSM trees. Additionally, we optimize the read and write performance of the split LSM tree with a variety of novel techniques. We build a hotspot-aware key-value database named SplitDB and perform extensive experiments. Experimental results demonstrate that SplitDB effectively prevents write stalls, achieves a 6-fold write reduction, and improves read throughput by 3.5 times compared to state-of-the-art key-value databases.
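The level-splitting idea in the abstract can be illustrated with a minimal sketch. This is not SplitDB's actual placement algorithm, which the abstract does not specify. The names `Level` and `place_levels`, the 10x level growth factor, the 64 MiB top-level size, and the 8 GiB NVM budget are all assumptions chosen for illustration. The sketch assigns the small upper levels to NVM until the budget is exhausted and sends every level below the split point to the slow device.

```python
from dataclasses import dataclass


@dataclass
class Level:
    index: int       # 0 is the topmost (smallest, hottest) level
    size_bytes: int  # assumed on-disk size of the level


def place_levels(levels, nvm_capacity_bytes):
    """Hypothetical placement: fill NVM from the top level downward.

    Once a level does not fit, it and all lower levels go to the slow
    device, so the tree is split at a single boundary.
    """
    placement = {}
    used = 0
    split_reached = False
    for level in sorted(levels, key=lambda lv: lv.index):
        if not split_reached and used + level.size_bytes <= nvm_capacity_bytes:
            placement[level.index] = "nvm"
            used += level.size_bytes
        else:
            split_reached = True
            placement[level.index] = "slow"
    return placement


# Assumed geometry: 64 MiB top level, each level 10x larger than the last.
levels = [Level(i, 64 * 2**20 * 10**i) for i in range(5)]
placement = place_levels(levels, nvm_capacity_bytes=8 * 2**30)
print(placement)
```

With these assumed sizes, the three upper levels total roughly 6.9 GiB and fit in the NVM budget, while the two largest levels land on the slow device. A single split boundary keeps the layout simple: every level above it is fast, every level below it is slow.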