Deep Multimodal Embedding Model for Fine-grained Sketch-based Image Retrieval

Fei Huang, Yong Cheng, Cheng Jin, Yuejie Zhang, Tao Zhang
DOI: https://doi.org/10.1145/3077136.3080681
2017-08-07
Abstract: Fine-grained Sketch-based Image Retrieval (Fine-grained SBIR), which uses hand-drawn sketches to search for images of a target object, has been an emerging topic over the last few years. The difficulties of this task come not only from the ambiguous and abstract nature of sketches, which carry less useful information, but also from the cross-modal gap at both the visual and semantic levels. However, images on the web are always presented with multimodal content. In this paper, we consider Fine-grained SBIR as a cross-modal retrieval problem and propose a deep multimodal embedding model that exploits all the beneficial multimodal information sources in sketches and images. In our experiments on a large quantity of public data, we show that the proposed method outperforms state-of-the-art methods for Fine-grained SBIR.