Enhanced Locality Sensitive Clustering in High Dimensional Space

Gang Chen, Hao-Lin Gao, Bi-Cheng Li, Guo-En Hu
DOI: https://doi.org/10.4313/teem.2014.15.3.125
2014-01-01
Transactions on Electrical and Electronic Materials
Abstract: A dataset can be clustered by merging the bucket indices produced by the random projections of locality sensitive hashing functions, provided that the merging interval is calculated first. To make large-scale data clustering in high-dimensional space more feasible, we propose an enhanced locality sensitive hashing clustering method. First, multiple hashing functions are generated. Second, data points are projected to bucket indices. Third, the bucket indices are clustered to obtain class labels. Experimental results on synthetic datasets show that the method achieves high accuracy at much improved clustering speed. These attributes make it well suited to clustering data in high-dimensional space.
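The three steps named in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say how the merging interval is calculated or how bucket indices are clustered. The sketch assumes the common random-projection hash h(x) = floor((a·x + b) / w), takes the bucket width `width` and the `merge_interval` as fixed, hand-picked parameters, and combines the per-hash groups by their tuple. All function and parameter names are hypothetical.

```python
import numpy as np

def make_hash_functions(dim, n_hashes, width, rng):
    # Step 1: one random Gaussian direction and one random offset per hash function.
    a = rng.normal(size=(n_hashes, dim))
    b = rng.uniform(0.0, width, size=n_hashes)
    return a, b

def bucket_indices(points, a, b, width):
    # Step 2: project every point and quantize the projection into integer buckets.
    return np.floor((points @ a.T + b) / width).astype(np.int64)

def merge_buckets_1d(idx, merge_interval):
    # Step 3 (per hash function): occupied buckets whose indices differ by at
    # most merge_interval are merged into the same group.
    occupied = np.unique(idx)                      # sorted occupied buckets
    gaps = np.diff(occupied)
    group_of_bucket = np.concatenate(([0], np.cumsum(gaps > merge_interval)))
    return group_of_bucket[np.searchsorted(occupied, idx)]

def lsh_cluster(points, n_hashes=8, width=4.0, merge_interval=1, seed=0):
    # Assumed parameters: width and merge_interval are fixed here, whereas the
    # abstract states that the merging interval is calculated.
    rng = np.random.default_rng(seed)
    a, b = make_hash_functions(points.shape[1], n_hashes, width, rng)
    buckets = bucket_indices(points, a, b, width)
    groups = np.column_stack(
        [merge_buckets_1d(buckets[:, j], merge_interval) for j in range(n_hashes)]
    )
    # Assumed combination rule: points share a label only if they fall in the
    # same merged group under every hash function.
    _, labels = np.unique(groups, axis=0, return_inverse=True)
    return labels.ravel()
```

Calling `lsh_cluster(points)` on an array of shape `(n_points, dim)` returns one integer label per point. Because each point is hashed once per function and the merging works on sorted one-dimensional bucket indices, the cost grows roughly linearly with the number of points, which is the kind of speed advantage the abstract describes.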