COLIN: A Cache-Conscious Dynamic Learned Index with High Read/Write Performance

Zhou Zhang, Pei-Quan Jin, Xiao-Liang Wang, Yan-Qi Lv, Shou-Hong Wan, Xi-Ke Xie
DOI: https://doi.org/10.1007/s11390-021-1348-2
IF: 1.871
2021-07-01
Journal of Computer Science and Technology
Abstract: The recently proposed learned index has higher query performance and space efficiency than the conventional B+-tree. However, the original learned index has the problems of insertion failure and unbounded query complexity, meaning that it supports neither insertions nor bounded query complexity. Some variants of the learned index use an out-of-place strategy and a bottom-up build strategy to accelerate insertions and support bounded query complexity, but introduce additional query costs and frequent node splitting operations. Moreover, none of the existing learned indices are cache-friendly. In this paper, aiming to not only support efficient queries and insertions but also offer bounded query complexity, we propose a new learned index called COLIN (Cache-cOnscious Learned INdex). Unlike previous solutions using an out-of-place strategy, COLIN adopts an in-place approach to support insertions and reserves some empty slots in a node to optimize the node's data placement. In particular, through model-based data placement and cache-conscious data layout, COLIN decouples the local-search boundary from the maximum error of the model. The experimental results on five workloads and three datasets show that COLIN achieves the best read/write performance among all compared indices and outperforms the second best index by 18.4%, 6.2%, and 32.9% on the three datasets, respectively.
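
The abstract's two central ideas, in-place insertion into a node with reserved empty slots and a local-search bound that does not depend on the model's maximum error, can be illustrated with a small sketch. This is not COLIN's algorithm, which the abstract does not specify. It is a generic gapped-array node under stated assumptions: the class name `GappedNode`, the parameters `capacity` and `window`, and the linear key-to-slot model are all hypothetical. A key is placed in the free slot nearest its predicted position, and a lookup scans only a fixed window around the prediction, so the search cost is bounded by `window` rather than by model error.

```python
from typing import List, Optional


class GappedNode:
    """Illustrative gapped-array node (hypothetical; not the COLIN design)."""

    def __init__(self, lo: float, hi: float, capacity: int = 32, window: int = 4):
        self.lo, self.hi = lo, hi          # key range covered by this node
        self.capacity = capacity           # total slots, including reserved empty ones
        self.window = window               # fixed local-search radius, in slots
        self.slots: List[Optional[float]] = [None] * capacity

    def _predict(self, key: float) -> int:
        # Assumed linear model: map the key range onto the slot range.
        frac = (key - self.lo) / (self.hi - self.lo)
        return min(self.capacity - 1, max(0, int(frac * self.capacity)))

    def _search_range(self, key: float) -> range:
        p = self._predict(key)
        return range(max(0, p - self.window), min(self.capacity, p + self.window + 1))

    def contains(self, key: float) -> bool:
        # Lookup cost is bounded by the window size, not by the model's error.
        return any(self.slots[i] == key for i in self._search_range(key))

    def insert(self, key: float) -> bool:
        """Insert in place; return False if the window is full (caller would split/retrain)."""
        if self.contains(key):
            return True
        p = self._predict(key)
        # Try the free slot closest to the predicted position first.
        for i in sorted(self._search_range(key), key=lambda i: abs(i - p)):
            if self.slots[i] is None:
                self.slots[i] = key
                return True
        return False
```

In this sketch a failed `insert` stands in for whatever a real index would do when a region fills up, such as splitting or retraining the node; choosing `window` so that the scanned slots fit in one or two cache lines is one plausible way to make such a layout cache-conscious.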
computer science, software engineering, hardware & architecture