Review on HDD-Based, SSD-Based and Hybrid Key-Value Stores

Juan Li, Nong Xiao, Yutong Lu, Zhiguang Chen, Fang Liu, Yuxuan Xing, Shuo Li
DOI: https://doi.org/10.1109/dasc-picom-datacom-cyberscitec.2017.198
2017-01-01
Abstract: The explosion of data in the era of big data and cloud computing demands substantial performance improvements in big data management technology and the underlying storage systems. As the proportion of unstructured data grows, NoSQL data management is developing rapidly, especially key-value store systems. Key-value stores build efficient indexes over big data by organizing it as key-value pairs, where the key is the unique identifier of the data and the value is the data content, which requires no predefined schema. A key-value store can be built on various storage systems: Hard Disk Drives (HDDs), flash-memory-based Solid State Disks (SSDs), or a hybrid of the two. HDD performance is normally limited by mechanical access characteristics, and the latency ratio between random and sequential I/Os is 1000:1. By contrast, SSDs provide high random and sequential performance as well as high concurrency, but some intrinsic properties of flash demand careful handling, such as out-of-place updates and limited lifespan. HDD-based, SSD-based and hybrid key-value stores have proposed different optimization techniques to deal with the intrinsic properties of their underlying storage devices. In this paper, we review some widely referenced research works on key-value stores [29], emphasizing techniques for reducing memory index overhead [30] and improving lookup performance; we also present some enlightening in-memory index designs from flash-based storage systems, even though those systems are not pure key-value stores. Moreover, the advantages, disadvantages and prospects of these research works are presented and discussed.
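To make the terms in the abstract concrete, the following is a minimal sketch, not a design taken from the paper, of a key-value store that keeps an in-memory index over an append-only log. It illustrates two ideas named above: the key as the unique identifier of a schema-free value, and out-of-place updates, where a new version is appended rather than overwritten. The class name `LogStructuredKV`, the record layout, and the use of `io.BytesIO` as a stand-in for a storage device are all assumptions made for illustration.

```python
import io
import struct


class LogStructuredKV:
    """Toy key-value store: an append-only log plus an in-memory hash index.

    Hypothetical illustration only; not any system reviewed in the paper.
    """

    # Each record starts with two little-endian uint32 fields:
    # key length and value length.
    _HEADER = struct.Struct("<II")

    def __init__(self):
        self._log = io.BytesIO()  # stands in for the storage device
        self._index = {}          # key -> offset of the newest record

    def put(self, key: bytes, value: bytes) -> None:
        # Out-of-place update: always append, never overwrite in place.
        offset = self._log.seek(0, io.SEEK_END)
        self._log.write(self._HEADER.pack(len(key), len(value)) + key + value)
        # The index now points at the new record; any older record for
        # this key stays in the log as garbage until it is reclaimed.
        self._index[key] = offset

    def get(self, key: bytes):
        offset = self._index.get(key)
        if offset is None:
            return None
        # One positioned read per lookup, located through the index.
        self._log.seek(offset)
        klen, vlen = self._HEADER.unpack(self._log.read(self._HEADER.size))
        self._log.seek(klen, io.SEEK_CUR)  # skip over the stored key
        return self._log.read(vlen)

    def log_size(self) -> int:
        return len(self._log.getvalue())


store = LogStructuredKV()
store.put(b"user:1", b"v1")
store.put(b"user:1", b"v2")  # appended; the first record is now stale
latest = store.get(b"user:1")
```

In this sketch every key costs one index entry in memory, which is the kind of overhead the reviewed index-reduction techniques aim to shrink, and the stale records left behind by updates are what garbage collection on a real device would have to reclaim.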