Compositional Correlation Quantization for Large-Scale Multimodal Search.

Mingsheng Long, Jianmin Wang, Philip S. Yu
2015-01-01
Abstract: Efficient similarity retrieval from large-scale multimodal databases is pervasive in current search systems amid the big data tidal wave. To support queries across content modalities, a system should enable cross-modal correlation and computation-efficient indexing. While hashing methods have shown great potential for approaching this goal, current attempts generally fail to learn isomorphic hash codes in a seamless scheme: they first embed multiple modalities into a continuous isomorphic space and then threshold the embeddings into binary codes, which incurs a substantial loss of search quality. In this paper, we establish seamless multimodal hashing by proposing a novel Compositional Correlation Quantization (CCQ) model. Specifically, CCQ jointly finds correlation-maximal mappings that transform different modalities into an isomorphic latent space and learns compositional quantizers that quantize the isomorphic latent features into compact binary codes. An optimization framework is developed to preserve both intra-modal similarity and inter-modal correlation while minimizing both reconstruction and quantization errors, and it can be trained on both paired and unpaired data in linear time. A comprehensive set of experiments shows the superior effectiveness and efficiency of CCQ against state-of-the-art techniques on both unimodal and cross-modal search tasks.
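To make the contrast in the abstract concrete, the following is a minimal sketch, not the CCQ training procedure itself. It assumes that a "compositional quantizer" approximates a latent vector by the sum of one codeword chosen from each of several codebooks, and it uses a simple greedy encoder for illustration. The function names, the toy codebooks, and the sample vector are all hypothetical; the learned correlation-maximal mappings are not modeled here.

```python
# Illustrative sketch only: NOT the authors' CCQ algorithm.
# Assumption: a compositional quantizer represents a latent vector as the
# sum of one codeword per codebook. All names and values are hypothetical.

def sign_threshold(z):
    """Two-step baseline: threshold a continuous embedding into bits."""
    return [1 if v >= 0 else 0 for v in z]

def squared_error(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode_compositional(z, codebooks):
    """Greedily pick one codeword index per codebook so the sum approximates z."""
    residual = list(z)
    indices = []
    for book in codebooks:
        # Choose the codeword closest to what is still unexplained.
        best = min(range(len(book)),
                   key=lambda k: squared_error(residual, book[k]))
        indices.append(best)
        residual = [r - c for r, c in zip(residual, book[best])]
    return indices

def decode_compositional(indices, codebooks):
    """Reconstruct the vector as the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for idx, book in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, book[idx])]
    return out

# Toy data: two codebooks with two 2-D codewords each (hypothetical values).
codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [-0.5, -0.5]],
]
z = [1.4, 0.6]  # stands in for a latent feature after the modality mapping

codes = encode_compositional(z, codebooks)
approx = decode_compositional(codes, codebooks)
quantization_error = squared_error(z, approx)
```

With K codewords per codebook, each index needs log2(K) bits, so the list of indices serves as the compact code. In this sketch the codebooks are fixed by hand; the abstract describes learning the mappings and quantizers jointly, which this example does not attempt.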