Cross-Domain Alignment for Zero-Shot Sketch-Based Image Retrieval
Xu Wang, Dezhong Peng, Peng Hu, Yunhong Gong, Yong Chen
DOI: https://doi.org/10.1109/TCSVT.2023.3265697
IF: 5.859
2023-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is an emerging topic with broad application prospects. Given a sketch as the query, the goal of ZS-SBIR is to retrieve semantically similar images under the zero-shot scenario. The key is to project images from the photo and sketch domains into a shared space in which both the domain gap and the semantic gap are bridged. Most previous studies have approached ZS-SBIR as a classification problem and used a classification loss to obtain discriminative features. However, these methods do not explicitly encourage feature alignment, which degrades retrieval performance. To address this issue, this paper proposes a novel method called Cross-domain Alignment (CA) for ZS-SBIR. Specifically, we present a Large-margin Cross-domain Contrastive (LCC) loss that encourages intra-class compactness and inter-class separability across both domains, motivated by the relationships among pairwise distances in metric learning. The loss improves feature alignment and yields more discriminative features. Moreover, based on the "embedding stability" phenomenon of neural networks, we devise a Cross-batch Semantic Metric (CSM) mechanism that further improves ZS-SBIR performance. Extensive experiments demonstrate that the proposed CA achieves encouraging performance on the challenging Sketchy and TU-Berlin benchmarks.
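The abstract does not give the exact form of the LCC loss or the CSM mechanism, so the following is only a minimal NumPy sketch of the general ideas it names: a margin-based contrastive loss computed over sketch-photo pairs, and a first-in-first-out store of embeddings from earlier batches. The function `cross_domain_contrastive_loss`, the class `EmbeddingQueue`, the `margin` and `capacity` parameters, and the squared-distance formulation are all assumptions made for illustration, not the authors' definitions.

```python
import numpy as np


def cross_domain_contrastive_loss(sketch_emb, photo_emb,
                                  sketch_labels, photo_labels, margin=1.0):
    """Generic margin-based contrastive loss over all sketch-photo pairs.

    Illustrative only: this is a textbook contrastive loss, not the
    paper's LCC loss, whose exact form the abstract does not state.
    """
    # L2-normalise so distances are comparable across the two domains.
    s = sketch_emb / np.linalg.norm(sketch_emb, axis=1, keepdims=True)
    p = photo_emb / np.linalg.norm(photo_emb, axis=1, keepdims=True)

    # Pairwise Euclidean distances, shape (n_sketch, n_photo).
    diff = s[:, None, :] - p[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # True where a sketch and a photo share a class label.
    same = sketch_labels[:, None] == photo_labels[None, :]

    # Same-class pairs are pulled together (intra-class compactness);
    # different-class pairs are pushed at least `margin` apart
    # (inter-class separability).
    pos = same * dist ** 2
    neg = (~same) * np.maximum(0.0, margin - dist) ** 2
    return float((pos + neg).mean())


class EmbeddingQueue:
    """FIFO store of embeddings from earlier batches (hypothetical helper).

    If embeddings drift slowly during training, entries from recent
    batches can serve as extra pairs for the current batch.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.emb = None
        self.labels = None

    def push(self, emb, labels):
        if self.emb is None:
            new_emb, new_labels = emb, labels
        else:
            new_emb = np.concatenate([self.emb, emb])
            new_labels = np.concatenate([self.labels, labels])
        # Keep only the most recent `capacity` entries.
        self.emb = new_emb[-self.capacity:].copy()
        self.labels = new_labels[-self.capacity:].copy()
```

In such a setup, the current batch of sketch embeddings could be compared against queued photo embeddings (and vice versa) by passing `queue.emb` and `queue.labels` to the loss, which increases the number of cross-domain pairs seen per training step.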