Multi-dimensional Hash Table Structure for Massive Data Applications

WU Quanyuan, PENG Can, ZHENG Yi, BU Junli
DOI: https://doi.org/10.16511/j.cnki.qhdxxb.2017.26.023
2017-01-01
Abstract: A traditional hash table locates a target quickly by computing the hash value of the target data, which enables fast data access and retrieval. Good storage performance, however, requires keeping the table loosely filled, sacrificing 10% to 25% of the space, which is a tremendous waste in massive data storage systems. This paper presents a multi-dimensional hash table structure that increases the logical dimension of the hash table to significantly reduce the collision rate, giving satisfactory performance at a high filling rate. Tests show that with ten million entries, the collision rate of a two-dimensional hash table is 2 to 4 orders of magnitude lower than that of a traditional hash table, and overall performance improves by 1 order of magnitude. In addition, a failure rate concept is proposed to improve hash table performance evaluation.
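
The abstract does not spell out the structure itself, so the following Python sketch is only one possible reading of "increasing the logical dimension": each dimension is assumed to contribute an independent hash function, and a key counts as a collision only when its candidate slot in every dimension is already occupied. The names `slot` and `collision_rate`, the salted BLAKE2b hash, and this definition of collision rate are all illustrative assumptions, not the authors' design, and the sketch does not attempt to reproduce the figures reported above.

```python
import hashlib


def slot(key: int, dim: int, size: int) -> int:
    """Hypothetical per-dimension hash: salt the key with the dimension index."""
    digest = hashlib.blake2b(f"{dim}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % size


def collision_rate(num_keys: int, size: int, dims: int) -> float:
    """Fraction of inserts that find all `dims` candidate slots occupied.

    Assumption: a collided key would go to some overflow area, which is
    omitted here because only the collision count is of interest.
    """
    occupied = [False] * size
    collisions = 0
    for key in range(num_keys):
        for d in range(dims):
            i = slot(key, d, size)
            if not occupied[i]:
                occupied[i] = True  # first free candidate slot wins
                break
        else:
            collisions += 1  # every candidate slot was already taken
    return collisions / num_keys


if __name__ == "__main__":
    size, num_keys = 10_000, 9_000  # 90% target filling rate
    for dims in (1, 2):
        print(dims, collision_rate(num_keys, size, dims))
```

With one dimension this reduces to a plain single-hash table; adding a second dimension gives each key a second candidate slot, which lowers the measured collision rate at the same filling rate.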