Abstract: As the volume of data that today's networks must store or monitor keeps growing, hash-based data structures are increasingly widely used because of their high memory efficiency and high speed. Most of them, such as Bloom filters, sketches, and d-left hash tables, use more than one hash function. Furthermore, to achieve good randomness, the hash functions used, such as MD5 and SHA1, are very complicated and consume many CPU cycles. As a consequence, computing these hash functions is time-consuming. To address this issue, we propose the Single Hash technique in this paper. It is based on the observation that the hash functions we use produce 32-bit or 64-bit values, whose value ranges are much larger than what we need in practice. We usually have to carry out a modular operation to map the hash results into a smaller range in the data structures listed above. In this procedure, the information carried by the high bits may be discarded. For example, if the bit array of a Bloom filter has length 2^20 while the hash functions are 32-bit, 12 bits of each hash result are discarded by the modular operation. We can use these bits to produce more hash values. Therefore, we propose to use a few bit operations to make full use of the information produced by one hash function and generate multiple hash values that can be used in these data structures. The Single Hash technique can be applied to most hash-based data structures. It can significantly improve their speed because, instead of computing multiple hash functions, we only need to compute one hash function and a few simple operations (e.g., bit shift and XOR). Other aspects of performance, such as the memory efficiency and accuracy of these data structures, are not affected by the Single Hash technique.
In this paper, we apply it to three classic hash-based data structures, i.e., Bloom filters, CM sketches, and d-left hash tables, as case studies, and evaluate their performance with both mathematical analysis and extensive experiments. We make all our code open source on GitHub.
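
The idea of deriving several indices from one hash value can be illustrated with a minimal Python sketch. It is not the construction from the paper, whose exact bit operations the abstract does not give. The names `base_hash`, `single_hash_indices`, and `BloomFilter`, the use of CRC32 as a stand-in 32-bit hash, and the particular shift-and-XOR mixing are all assumptions made for illustration. The sketch uses the abstract's example sizes: a 2^20-bit array and a 32-bit hash, which leaves 12 high bits unused by the modular step.

```python
import zlib

TABLE_BITS = 20                 # bit array of length 2^20, as in the abstract's example
TABLE_SIZE = 1 << TABLE_BITS
MASK = TABLE_SIZE - 1


def base_hash(key: bytes) -> int:
    """Stand-in 32-bit hash (assumption: CRC32 replaces the real hash function)."""
    return zlib.crc32(key) & 0xFFFFFFFF


def single_hash_indices(key: bytes, k: int) -> list[int]:
    """Derive k array indices from ONE 32-bit hash using shift and XOR.

    Hypothetical mixing rule: XOR the low 20 bits with the 12 leftover
    high bits shifted left by i, then mask back into the table range.
    """
    h = base_hash(key)          # the only hash computation per key
    low = h & MASK              # bits a plain modular step would keep
    high = h >> TABLE_BITS      # the 12 bits a plain modular step would discard
    return [(low ^ (high << i)) & MASK for i in range(k)]


class BloomFilter:
    """Bloom filter whose k probe positions all come from one hash call."""

    def __init__(self, k: int = 4) -> None:
        self.k = k
        self.bits = bytearray(TABLE_SIZE // 8)

    def add(self, key: bytes) -> None:
        for idx in single_hash_indices(key, self.k):
            self.bits[idx >> 3] |= 1 << (idx & 7)

    def __contains__(self, key: bytes) -> bool:
        return all(
            self.bits[idx >> 3] & (1 << (idx & 7))
            for idx in single_hash_indices(key, self.k)
        )


bf = BloomFilter(k=4)
bf.add(b"flow-1")
print(b"flow-1" in bf)
```

Each insertion or lookup calls `base_hash` once and then performs only shifts, XORs, and masks. How independent the derived indices are depends on the mixing rule, so this rule should be read as a placeholder rather than a recommendation.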

Skip Hash: A Fast Ordered Map Via Software Transactional Memory

GPU Lock-Free Hopscotch Hash Table

Cache-Aware Lock-Free Concurrent Hash Tries

A Concurrent Skip List Balanced On Search

Concurrent Deterministic Skiplist and Other Data Structures

PhaST: Hierarchical Concurrent Log-Free Skip List for Persistent Memory

Rectangular Hash Table: Bloom Filter And Bitmap Assisted Hash Table With High Speed

Shifting Hash Table: An Efficient Hash Table With Delicate Summary

Efficient Almost Wait-Free Parallel Accessible Dynamic Hashtables

A Fair and Memory/Time-efficient Hashmap

DHash: Dynamic Hash Tables With Non-Blocking Regular Operations

Revisiting Persistent Hash Table Design for Commercial Non-Volatile Memory

Lock-free Dynamic Hash Tables with Open Addressing

Adaptive Lock-Free Data Structures in Haskell: A General Method for Concurrent Implementation Swapping

Fast Consistent Hashing in Constant Time

A Skew-Insensitive Hashing Sync and Construction Scheme for Many-Core Coprocessors

Dynamic-Sized Nonblocking Hash Tables

Single Hash: Use One Hash Function to Build Faster Hash Based Data Structures

Hashkv: Enabling Efficient Updates In Kv Storage Via Hashing

Lock-free Linearizable 1-Dimensional Range Queries

DLHT: A Non-blocking Resizable Hashtable with Fast Deletes and Memory-awareness