Sparse Spectral Hashing

Jian Shao, Fei Wu, Chuanfei Ouyang, Xiao Zhang
DOI: https://doi.org/10.1016/j.patrec.2011.10.018
IF: 4.757
2012-01-01
Pattern Recognition Letters
Abstract: A better similarity index structure for high-dimensional feature datapoints is very desirable for building scalable content-based search systems on feature-rich datasets. In this paper, we introduce sparse principal component analysis (Sparse PCA) and Boosting Similarity Sensitive Hashing (Boosting SSC) into traditional spectral hashing to obtain effective, data-aware binary codes for real data. We call the resulting method Sparse Spectral Hashing (SSH). SSH formulates binary coding as thresholding a subset of eigenvectors of the graph Laplacian while constraining the number of nonzero features. Convex relaxation and eigenfunction learning are used in SSH to make the coding globally optimal and effective for datapoints outside the training data. Comparisons in terms of F1 score and AUC show that SSH substantially outperforms other methods on both image and text datasets.
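The general idea of thresholding projections onto sparse directions can be illustrated with a minimal sketch. This is not the SSH algorithm from the paper: it swaps in a truncated power iteration (keep the k largest-magnitude loadings at each step) in place of the convex relaxation, thresholds centered projections at zero instead of using learned eigenfunctions, and leaves out Boosting SSC entirely. All function names and parameters (`truncated_power_direction`, `fit_sparse_hash`, `encode`, `n_bits`, `k`) are hypothetical.

```python
import numpy as np

def truncated_power_direction(cov, k, iters=200, seed=0):
    """Approximate a leading eigenvector of `cov` with at most k nonzeros (k >= 1)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(cov.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(iters):
        w = cov @ w
        # Zero every entry except the k largest in magnitude.
        drop = np.argsort(np.abs(w))[:-k]
        w[drop] = 0.0
        norm = np.linalg.norm(w)
        if norm == 0.0:
            break
        w /= norm
    return w

def fit_sparse_hash(X, n_bits, k):
    """Learn one sparse projection direction per bit from training data X (n, d)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / len(X)
    directions = []
    for b in range(n_bits):
        w = truncated_power_direction(cov, k, seed=b)
        directions.append(w)
        # Projection deflation: remove the variance already captured by w.
        P = np.eye(len(w)) - np.outer(w, w)
        cov = P @ cov @ P
    return mean, np.array(directions).T  # W has shape (d, n_bits)

def encode(X, mean, W):
    """Binary codes: threshold each centered projection at zero."""
    return ((X - mean) @ W > 0).astype(np.uint8)
```

Because `encode` only needs the stored mean and directions, it also applies to points that were not in the training set, which loosely mirrors the out-of-sample goal stated in the abstract.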