On Learning Semantic Representations for Large-Scale Abstract Sketches
Peng Xu,Yongye Huang,Tongtong Yuan,Tao Xiang,Timothy M. Hospedales,Yi-Zhe Song,Liang Wang
DOI: https://doi.org/10.1109/tcsvt.2020.3041586
IF: 5.859
2021-09-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract:In this paper, we focus on learning semantic representations for large-scale highly abstract sketches that were produced by the practical sketch-based application rather than the excessively well dawn sketches obtained by crowd-sourcing. We propose a dual-branch CNN-RNN network architecture to represent sketches, which simultaneously encodes both the static and temporal patterns of sketch strokes. Based on this architecture, we further explore learning the sketch-oriented semantic representations in two practical settings, i.e., hashing retrieval and zero-shot recognition on million-scale highly abstract sketches produced by practical online interactions. Specifically, we use our dual-branch architecture as a universal representation framework to design two sketch-specific deep models: (i) We propose a deep hashing model for sketch retrieval, where a novel hashing loss is specifically designed to further accommodate both the abstract and messy traits of sketches. (ii) We propose a deep embedding model for sketch zero-shot recognition, via collecting a large-scale edge-map dataset and proposing to extract a set of semantic vectors from edge-maps as the semantic knowledge for sketch zero-shot domain alignment. Both deep models are evaluated by comprehensive experiments on million-scale abstract sketches produced by a global online game QuickDraw and outperform state-of-the-art competitors.
engineering, electrical & electronic