Embedding Index Maintenance in Store Routines to Accelerate Secondary Index Building in HBase

Chun Cao, Weiyi Wang, Ying Zhang, Jian Lu
DOI: https://doi.org/10.1109/cloud.2018.00070
2018-01-01
Abstract: Secondary indexes accelerate queries on non-rowkey columns in HBase by maintaining index items synchronously or asynchronously. Although existing asynchronous indexes incur less insertion overhead than synchronous ones, they still need an additional process to repair possible inconsistencies. This paper proposes embedding index repair into data maintenance, which eliminates the extra process and reduces the cost of keeping the index consistent. We implement this approach in a store engine, together with the corresponding client API, coprocessor, and index-delete queue, to form an effective secondary index building system for HBase. Experiments on the YCSB benchmark show that it achieves a good balance between read and write performance, as well as better stability than other index building approaches.