LDC: A Lower-Level Driven Compaction Method to Optimize SSD-Oriented Key-Value Stores

Yunpeng Chai, Yanfeng Chai, Xin Wang, Haocheng Wei, Ning Bao, Yushi Liang
DOI: https://doi.org/10.1109/ICDE.2019.00070
2019-01-01
Abstract: Log-structured merge (LSM) tree key-value (KV) stores have been widely deployed in many NoSQL and SQL systems, serving online big data applications such as social networking, bioinformatics, graph processing, and machine learning. The batch processing of sorted data merging (i.e., compaction) in LSM-tree KV stores greatly improves the efficiency of writing, leading to good write performance and high space efficiency. Recently, some lazy compaction methods were proposed to further improve system throughput by delaying compaction so that more data accumulates within a compaction batch. However, batched writing also leads to significant tail latency, which is unacceptable for online processing, and the newly proposed lazy approaches worsen the tail latency problem. Furthermore, the unbalanced read/write performance of the widely deployed SSDs makes performance optimization harder. Aiming to optimize both tail latency and system throughput, in this paper we propose a novel Lower-level Driven Compaction (LDC) method for LSM-tree KV stores. LDC breaks the limitations of the traditional upper-level driven compaction manner and triggers practical compaction actions by lower-level data. It has the benefits of both effectively decreasing the compaction granularity for smaller tail latency and reducing the write amplification of LSM-tree compaction for higher throughput. We have implemented LDC in LevelDB; the experimental results indicate that LDC can reduce the 99.9th percentile latency by 2.62 times compared with the traditional upper-level driven compaction mechanism, and achieve 56.7% to 72.3% higher system throughput at the same time.
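To make the terms "compaction granularity" and "write amplification" concrete, the following is a minimal toy model of one traditional upper-level driven compaction step. It is not the paper's LDC algorithm or LevelDB's code: the `SSTable` type, the helper names, the key ranges, and the file sizes are all hypothetical, and the model assumes the merged output is as large as the sum of its inputs (no overwritten or deleted keys).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Hypothetical sorted file covering the inclusive key range [lo, hi]."""
    lo: int
    hi: int
    size_mb: int


def overlapping(upper, lower_level):
    """Return the lower-level files whose key range intersects the upper file."""
    return [t for t in lower_level if t.lo <= upper.hi and upper.lo <= t.hi]


def upper_driven_cost(upper, lower_level):
    """Estimate one upper-level driven compaction.

    Returns (batch size in MB, write amplification), assuming every
    overlapping lower-level file is rewritten together with the upper file.
    """
    victims = overlapping(upper, lower_level)
    rewritten_mb = upper.size_mb + sum(t.size_mb for t in victims)
    return rewritten_mb, rewritten_mb / upper.size_mb


# One 2 MB upper-level file spanning keys 0..99, and twenty 2 MB
# lower-level files covering 10 keys each (0..9, 10..19, ..., 190..199).
upper = SSTable(0, 99, 2)
lower = [SSTable(i * 10, i * 10 + 9, 2) for i in range(20)]

batch_mb, amplification = upper_driven_cost(upper, lower)
print(batch_mb, amplification)
```

In this toy setup, pushing down 2 MB of new data forces ten lower-level files to be rewritten in a single batch. The size of that batch is the compaction granularity that drives tail latency, and the ratio of rewritten to new data is the write amplification that limits throughput.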