Fast Binary Embedding for High-Dimensional Data
Felix X. Yu, Yunchao Gong, Sanjiv Kumar
DOI: https://doi.org/10.1007/978-3-319-14998-1_16
2015-01-01
Abstract: Binary embedding of high-dimensional data requires long codes to preserve the discriminative power of the input space. Traditional binary coding methods often suffer from very high computation and storage costs in such a scenario. To address this problem, we propose two solutions which improve over existing approaches. The first method, Bilinear Binary Embedding (BBE), converts high-dimensional data to compact similarity-preserving binary codes using compact bilinear projections. Compared to methods that use unstructured matrices for projection, it improves the time complexity from $\mathcal{O}(d^2)$ to $\mathcal{O}(d^{1.5})$, and the space complexity from $\mathcal{O}(d^2)$ to $\mathcal{O}(d)$, where $d$ is the input dimensionality. The second method, Circulant Binary Embedding (CBE), generates binary codes by projecting the data with a circulant matrix. The circulant structure enables the use of the Fast Fourier Transform to speed up the computation. This further improves the time complexity to $\mathcal{O}(d\log d)$. For both BBE and CBE, we propose to learn the projections in a data-dependent fashion. We show by extensive experiments that the proposed approaches give much better performance than the state of the art for a fixed time budget, and provide much faster computation with no performance degradation for a fixed number of bits. The BBE and CBE methods were previously presented in [6, 38]. In this book chapter, we present the two approaches in a unified framework, covering randomized binary embedding, learning-based binary embedding, and learning with dimension reductions. We also discuss the choice of algorithms.
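The two projection structures named in the abstract can be illustrated with a minimal NumPy sketch of their randomized (not learned) variants. It is an illustration under stated assumptions, not the chapter's implementation: the function names, the random sign-flipping vector `signs`, the square reshaping with $d_1 = d_2 = \sqrt{d}$, and the use of Gaussian and QR-based random orthogonal matrices are all choices made here. The data-dependent learning step is omitted.

```python
import numpy as np


def circulant_matrix(r):
    """Dense d x d circulant matrix with first column r (for checking only)."""
    d = r.shape[0]
    idx = (np.arange(d)[:, None] - np.arange(d)[None, :]) % d
    return r[idx]  # entry (i, j) is r[(i - j) mod d]


def cbe_project(x, r, signs):
    """circ(r) @ (signs * x), computed via FFT in O(d log d) time.

    Multiplying by a circulant matrix is a circular convolution, which
    becomes an elementwise product in the Fourier domain.
    `signs` is a +/-1 vector (an assumption of this sketch).
    """
    return np.fft.ifft(np.fft.fft(r) * np.fft.fft(signs * x)).real


def cbe_encode(x, r, signs, k=None):
    """Binary code from the circulant projection; keep the first k bits if given."""
    bits = (cbe_project(x, r, signs) >= 0).astype(np.uint8)
    return bits if k is None else bits[:k]


def random_orthogonal(n, rng):
    """Random n x n orthogonal matrix from the QR factorization of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


def bbe_project(x, R1, R2):
    """Bilinear projection R1^T Z R2, where Z is x reshaped to d1 x d2.

    With d1 = d2 = sqrt(d), this costs O(d^1.5) time and stores only
    O(d) parameters, versus O(d^2) for an unstructured d x d matrix.
    """
    d1, d2 = R1.shape[0], R2.shape[0]
    Z = x.reshape(d1, d2)
    return R1.T @ Z @ R2


def bbe_encode(x, R1, R2):
    """Binary code from the signs of the bilinear projection."""
    return (bbe_project(x, R1, R2) >= 0).astype(np.uint8).ravel()


# Small demo with hypothetical sizes.
rng = np.random.default_rng(0)
d, d1, d2 = 16, 4, 4
x = rng.standard_normal(d)

r = rng.standard_normal(d)                 # defines the circulant matrix
signs = rng.choice([-1.0, 1.0], size=d)    # random sign flips (assumed)
cbe_code = cbe_encode(x, r, signs)

R1 = random_orthogonal(d1, rng)
R2 = random_orthogonal(d2, rng)
bbe_code = bbe_encode(x, R1, R2)
```

In this sketch the CBE path stores only the $d$-dimensional vector `r` (plus `signs`), and the BBE path stores two $\sqrt{d} \times \sqrt{d}$ matrices, which is where the $\mathcal{O}(d)$ space figure comes from. The dense `circulant_matrix` helper exists only to verify the FFT path on small inputs.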