SELSH: A Hashing Scheme for Approximate Similarity Search with Early Stop Condition

Jie Chen, Chengkun He, Gang Hu, Jie Shao
DOI: https://doi.org/10.1007/978-3-319-27674-8_10
2016-01-01
Abstract: Similarity search is a fundamental problem in many multimedia database applications. Because of the “curse of dimensionality”, the performance of many access methods degrades significantly as dimensionality increases. Approximate similarity search is an alternative, and Locality Sensitive Hashing (LSH) is the most popular method for it. Nevertheless, LSH must verify a large number of points to obtain good-enough results, which incurs substantial I/O cost. In this paper, we propose a new scheme called SortedKey and Early stop LSH (SELSH), which extends the previous SortingKeys-LSH (SK-LSH). SELSH uses a linear order to sort all the compound hash keys. Moreover, during query processing, an early stop condition and a limit on the number of pages are used to determine whether a page needs to be accessed. Our experiments demonstrate the superiority of the proposed method over two state-of-the-art methods, C2LSH and SK-LSH.
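The abstract names three ingredients: compound hash keys sorted in a linear order, a page limit, and an early stop condition. The sketch below illustrates how such pieces could fit together; it is not the authors' algorithm. Everything specific in it is an assumption made for illustration: the p-stable (Gaussian) hash family, plain lexicographic tuple order standing in for the paper's linear order, the outward page-visiting order, the distance-threshold stop rule, and all names (`make_hash_functions`, `build_index`, `query`, `max_pages`, `stop_dist`).

```python
import bisect
import math
import random


def make_hash_functions(dim, m, w, seed=0):
    """Draw m hash functions h(v) = floor((a . v + b) / w) with Gaussian a (assumed family)."""
    rng = random.Random(seed)
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
            for _ in range(m)]


def compound_key(point, funcs, w):
    """Concatenate the m hash values into one compound key (a tuple of ints)."""
    return tuple(
        math.floor((sum(ai * xi for ai, xi in zip(a, point)) + b) / w)
        for a, b in funcs
    )


def build_index(points, funcs, w, page_size):
    """Sort (key, point id) pairs and cut the sorted list into fixed-size pages.

    Lexicographic tuple order is an assumed stand-in for the paper's linear order.
    """
    entries = sorted((compound_key(p, funcs, w), i) for i, p in enumerate(points))
    return [entries[i:i + page_size] for i in range(0, len(entries), page_size)]


def query(q, points, pages, funcs, w, k, max_pages, stop_dist):
    """Scan pages outward from the query key's position in the sorted order.

    Stops when max_pages pages were read, or (assumed early stop rule) when
    the k-th best candidate is already within stop_dist of the query.
    Returns (sorted list of (distance, point id), number of pages read).
    """
    qkey = compound_key(q, funcs, w)
    first_keys = [page[0][0] for page in pages]
    start = max(bisect.bisect_right(first_keys, qkey) - 1, 0)
    order = sorted(range(len(pages)), key=lambda j: abs(j - start))

    best, visited = [], 0
    for j in order[:max_pages]:
        visited += 1  # one simulated page access
        for _, pid in pages[j]:
            best.append((math.dist(q, points[pid]), pid))
        best = sorted(best)[:k]
        if len(best) == k and best[-1][0] <= stop_dist:
            break  # early stop: current candidates are judged good enough
    return best, visited


if __name__ == "__main__":
    rng = random.Random(1)
    data = [[rng.random() for _ in range(4)] for _ in range(48)]
    funcs = make_hash_functions(dim=4, m=3, w=0.5)
    pages = build_index(data, funcs, w=0.5, page_size=8)
    result, pages_read = query(data[0], data, pages, funcs, 0.5,
                               k=3, max_pages=2, stop_dist=0.3)
    print(result, pages_read)
```

In this sketch, `max_pages` bounds the I/O cost directly, while `stop_dist` trades accuracy for fewer page reads; setting `stop_dist` below zero disables the early stop so that only the page limit applies.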