Rectangular Hash Table: Bloom Filter And Bitmap Assisted Hash Table With High Speed

Tong Yang, Binchao Yin, Hang Li, Muhammad Shahzad, Steve Uhlig, Bin Cui, Li Xiaoming
DOI: https://doi.org/10.1109/BigData.2017.8257999
2017-01-01
Abstract: The hash table, a widely used data structure, can achieve an O(1) average lookup speed at the cost of large memory usage. Unfortunately, hash tables suffer from collisions, and the collision rate is largely determined by the load factor. Broadly speaking, existing research has taken two approaches to improve the performance of hash tables. The first approach trades off collision rate against memory usage, but only works well under low load. The second approach pursues high load and no hash collisions, but comes with update failures. The goal of this paper is to design a practical and efficient hash table that achieves a high load factor, a low hash collision rate, fast lookup speed, fast update speed, and zero update failures. To achieve this goal, we take a three-step approach. First, we propose a set of hashing techniques that leverage Bloom filters to significantly reduce hash collision rates. Second, we introduce a novel kick mechanism to achieve a high load factor. Last, we develop bitmaps to significantly accelerate the kick mechanism. Theoretical analysis and experimental results show that our hashing schemes significantly outperform the state of the art. Our hash table achieves a high load factor (greater than 95%) and a low collision rate (less than 0.56%), and the number of hash buckets is almost equal to the number of key-value pairs. Given n key-value pairs, the collision rate is reduced to 0 by either using 1.18 × n buckets or allowing up to 5 blind kicks. We have released the source code of the implementations of our hash table and of 6 prior hash tables on GitHub [1].
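For readers unfamiliar with the building block named in the first step, the following is a minimal sketch of a generic Bloom filter in Python. It is not the paper's hashing scheme, which the abstract does not describe in detail. The class name `BloomFilter`, the parameters `m_bits` and `k_hashes`, and the use of salted SHA-256 to derive bit positions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter (illustrative only, not the paper's design)."""

    def __init__(self, m_bits: int, k_hashes: int) -> None:
        self.m = m_bits                            # size of the bit array
        self.k = k_hashes                          # bit positions per key
        self.bits = bytearray((m_bits + 7) // 8)   # all bits start at 0

    def _positions(self, key: str):
        # Assumption: derive k positions by salting SHA-256 with the index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key: str) -> None:
        # Set every bit position associated with the key.
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: str) -> bool:
        # False means definitely absent; True may be a false positive.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))


bf = BloomFilter(m_bits=1024, k_hashes=3)
bf.add("alice")
print(bf.might_contain("alice"))  # True: added keys are never reported absent
```

A filter of this kind answers membership queries with no false negatives, which is the property that lets a hash table skip probes for keys that are certainly absent from a bucket group.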