Boosting Fine-Grained Sketch-Based Image Retrieval with Self-Supervised Learning

Rui Feng, Yuejie Zhang, Zhaolong Zhang, T. Zhang, Yangdong Chen
DOI: https://doi.org/10.1109/ICASSP49357.2023.10095145
2023-06-04
Abstract: Fine-grained sketch-based image retrieval (FG-SBIR) aims to align images and sketches at the instance level. The task is challenging because of the significant differences between sketches and images. Existing methods often perform suboptimally because large-scale fine-grained image-sketch datasets are lacking and because they depend strongly on classification models pre-trained on ImageNet. In this paper, we propose a self-supervised pre-trained FG-SBIR model that does not depend on large-scale annotated datasets. Only images and their corresponding edge maps are used at the pre-training stage. A mixed modal transformation is designed to generate different mixed-up views. The FG-SBIR model is pre-trained by minimizing the distance between views of the same instance and is then fine-tuned with a simple triplet loss. With a plain downstream network, it achieves generally better performance than state-of-the-art models on three widely used FG-SBIR datasets.
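The two training objectives named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the distance function, the margin, or the embedding network, so the Euclidean distance, the margin value of 0.2, and the function names `pretrain_view_loss` and `triplet_loss` are all assumptions made for illustration. Embeddings are represented as plain lists of floats.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pretrain_view_loss(view_a, view_b):
    """Assumed pre-training objective: squared distance between the
    embeddings of two views of the same instance (lower is better)."""
    return euclidean(view_a, view_b) ** 2


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss (margin value is an assumption).

    anchor:   sketch embedding
    positive: embedding of the matching image
    negative: embedding of a non-matching image
    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

In this sketch, a sketch embedding that already lies closer to its matching image than to a non-matching one by more than the margin contributes no loss, which is the behaviour the fine-tuning stage relies on.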
Computer Science