ACNet: Approaching-and-Centralizing Network for Zero-Shot Sketch-Based Image Retrieval

Hao Ren, Ziqiang Zheng, Yang Wu, Hong Lu, Yang Yang, Ying Shan, Sai-Kit Yeung
DOI: https://doi.org/10.1109/tcsvt.2023.3248646
IF: 5.859
2023-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: The large domain gap between sketches and photos poses major challenges for Sketch-Based Image Retrieval (SBIR). Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is more generic and practical but brings an even greater challenge: the additional knowledge gap between seen and unseen categories. To mitigate both gaps simultaneously, we propose an Approaching-and-Centralizing Network (termed “ACNet”) that jointly optimizes sketch-to-photo synthesis and image retrieval. The retrieval module guides the synthesis module to generate large amounts of diverse photo-like images, which help the sketch domain gradually approach the photo domain to eliminate the domain gap and thus better serve retrieval. Meanwhile, the retrieval module itself centralizes the embeddings of training samples to learn a similarity measurement that eliminates the knowledge gap. Our approach is simple yet effective: it achieves state-of-the-art performance on two widely used ZS-SBIR datasets and surpasses previous methods by a large margin (e.g., an 8.2% improvement in mAP@all on the TU-Berlin Extended dataset).
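The abstract reports its headline result in mAP@all. As a rough illustration of what that metric measures, the sketch below computes mean average precision over full ranked lists. It assumes the usual retrieval convention that each sketch query ranks the entire photo gallery and that a photo counts as relevant when it shares the query's category. The function names `average_precision` and `mean_average_precision` are hypothetical, and this is not the authors' evaluation code.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one query.

    `relevance` is the ranked gallery for that query (best match first),
    with True where the retrieved photo is relevant (assumed: same category).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, accumulated only at relevant positions.
            precision_sum += hits / rank
    # A query with no relevant item contributes 0 by convention here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_relevance: Sequence[Sequence[bool]]) -> float:
    """Mean of per-query average precision over all queries (mAP@all)."""
    if not ranked_relevance:
        return 0.0
    total = sum(average_precision(r) for r in ranked_relevance)
    return total / len(ranked_relevance)


if __name__ == "__main__":
    # Two toy queries, each ranking a tiny gallery.
    queries = [[True, False, True], [False, True]]
    print(mean_average_precision(queries))
```

Because the whole ranked list is scored rather than a top-k cutoff, relevant photos ranked far down the list still lower the score, which is why mAP@all is a demanding measure for zero-shot retrieval.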
engineering, electrical & electronic
What problem does this paper attempt to address?