Semantic-rebased cross-modal hashing for scalable unsupervised text-visual retrieval

Weiwei Wang, Yuming Shen, Haofeng Zhang, Li Liu
DOI: https://doi.org/10.1016/j.ipm.2020.102374
2020-11-01
Abstract:<p>Recently, learning-based cross-modal hashing has attracted increasing research interest for its low computational complexity and memory requirements. Among existing cross-modal techniques, supervised algorithms can achieve better performance. However, because labeled data is costly to acquire, unsupervised methods become the practical choice for large-scale unlabeled web images. The label-free nature of unsupervised cross-modal hashing prevents models from exploiting the exact semantic similarity of the data. Existing research typically simulates the semantics with a heuristic geometric prior in the original feature space, using pseudo labels or traditional dense graph structures. However, this introduces heavy bias into the model because the original features do not fully represent the underlying multi-view data relations, and these two structures may suffer from issues such as interference noise or high sensitivity to the number of clusters. To address these problems, we propose a novel unsupervised sparse-graph-based hashing method called Semantic-Rebased Cross-modal Hashing (SRCH). A novel '<em>Set-and-Rebase</em>' process is defined to initialize and update the cross-modal similarity graph of the training data. In particular, we <em>set</em> the graph according to the intra-modal feature geometric basis and then alternately <em>rebase</em> it, updating its edges according to the hashing results. We develop an alternating optimization routine that <em>rebases</em> the graph and trains the hashing auto-encoders with closed-form solutions, so that the overall framework is trained efficiently. Our experimental results on benchmark datasets demonstrate the superiority of our model over state-of-the-art algorithms.</p>
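The abstract describes the Set-and-Rebase process only at a high level, so the following is a minimal sketch of the general idea and not the authors' implementation. It assumes the graph is *set* as a symmetric k-nearest-neighbour graph under cosine similarity, and *rebased* by keeping only edges whose binary codes lie within a Hamming-distance threshold. The function names `set_graph` and `rebase_graph`, the parameters `k` and `max_hamming`, and both rules are hypothetical choices made for illustration; the paper's actual update rule and its closed-form auto-encoder training are not reproduced here.

```python
import numpy as np


def set_graph(features, k):
    """'Set' step (assumed form): sparse symmetric kNN graph from cosine similarity."""
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own neighbour
    neighbours = np.argsort(-sim, axis=1)[:, :k]  # k most similar samples per row
    n = features.shape[0]
    graph = np.zeros((n, n), dtype=bool)
    graph[np.repeat(np.arange(n), k), neighbours.ravel()] = True
    return graph | graph.T  # symmetrise: keep an edge if either endpoint chose it


def rebase_graph(graph, codes, max_hamming):
    """'Rebase' step (assumed rule): keep edges whose codes differ in few bits.

    `codes` holds entries in {-1, +1} with shape (n_samples, n_bits).
    """
    n_bits = codes.shape[1]
    # For +/-1 codes, Hamming distance = (n_bits - inner product) / 2.
    hamming = (n_bits - codes @ codes.T) / 2
    return graph & (hamming <= max_hamming)
```

In an alternating scheme of the kind the abstract outlines, a call to `rebase_graph` would follow each round of hash-code learning, and the updated graph would then guide the next round.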
computer science, information systems; information science & library science