U2-Tree: A Universal Two-Layer Distributed Indexing Scheme for Cloud Storage System
Xiaofeng Gao, Yuanning Gao, Yichen Zhu, Guihai Chen
DOI: https://doi.org/10.1109/tnet.2019.2891008
2019-01-01
IEEE/ACM Transactions on Networking
Abstract: Indices in cloud storage systems manage the stored data and support diverse queries efficiently. A secondary index, that is, an index built on attributes other than the primary key, facilitates a variety of queries for different purposes. One efficient design for secondary indices is the two-layer indexing scheme, which divides the indices in the system into a global index layer and a local index layer. However, previous work on two-layer indexing mainly targets P2P overlay networks. In this paper, we propose U2-Tree, a universal two-layer distributed indexing scheme built on data center networks with tree-like topologies. To construct the U2-Tree, we first build the local index according to data features and then assign each host a potential indexing range of the global index based on the distribution of its local data. After that, we use several false-positive control techniques, including gap elimination and Bloom filters, to publish metadata about the local index to the global index host. In the final step, the global index collects the published information and organizes it with tree data structures. Our design takes advantage of the topological properties of tree-like topologies, and we introduce and compare detailed optimization techniques for constructing the two-layer indexing scheme. Furthermore, we discuss index updating, index tuning, and the fault tolerance of U2-Tree. Finally, we validate the effectiveness and efficiency of U2-Tree through a series of theoretical analyses and numerical experiments on the Amazon EC2 platform.
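The publishing step mentioned in the abstract combines gap elimination with a Bloom filter to limit false positives at the global index. The Python sketch below illustrates the general idea only; it is not the authors' implementation. The function and class names (`eliminate_gaps`, `BloomFilter`, `publish`, `host_may_hold`), the `max_gap` threshold, and the filter sizes are all hypothetical choices made for illustration.

```python
import hashlib
from typing import Iterator, List, Tuple


def eliminate_gaps(keys: List[int], max_gap: int) -> List[Tuple[int, int]]:
    """Split the key set into sub-ranges wherever two consecutive
    keys differ by more than max_gap (an assumed tuning parameter)."""
    if not keys:
        return []
    ordered = sorted(keys)
    ranges = []
    start = prev = ordered[0]
    for key in ordered[1:]:
        if key - prev > max_gap:
            # Close the current range; the gap itself is not published.
            ranges.append((start, prev))
            start = key
        prev = key
    ranges.append((start, prev))
    return ranges


class BloomFilter:
    """Minimal Bloom filter; a Python int serves as the bit array."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key: int) -> Iterator[int]:
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: int) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: int) -> bool:
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def publish(keys: List[int], max_gap: int) -> Tuple[List[Tuple[int, int]], BloomFilter]:
    """Build the metadata a host would send to a global index host
    (hypothetical message shape: gap-free ranges plus a Bloom filter)."""
    bloom = BloomFilter()
    for key in keys:
        bloom.add(key)
    return eliminate_gaps(keys, max_gap), bloom


def host_may_hold(key: int, ranges: List[Tuple[int, int]], bloom: BloomFilter) -> bool:
    """Global-side check: forward a query only if both filters pass."""
    in_range = any(lo <= key <= hi for lo, hi in ranges)
    return in_range and bloom.might_contain(key)
```

For example, with local keys `[1, 2, 3, 50, 51, 52]` and `max_gap=5`, the host would publish the ranges `(1, 3)` and `(50, 52)` instead of the single range `(1, 52)`, so a query for key 25 is rejected by the range check alone. Keys that fall inside a published range but are not stored locally are then screened by the Bloom filter, which can still return a false positive but never a false negative.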