Transductive Zero-Shot Learning With Adaptive Structural Embedding

Yunlong Yu, Zhong Ji, Jichang Guo, Yanwei Pang
DOI: https://doi.org/10.1109/TNNLS.2017.2753852
Abstract: Zero-shot learning (ZSL) endows a computer vision system with the inferential capability to recognize new categories that it has never seen before. Two fundamental challenges in ZSL are visual-semantic embedding in the cross-modality learning step and domain adaptation in the unseen class prediction step. This paper presents two corresponding methods, named Adaptive STructural Embedding (ASTE) and Self-PAced Selective Strategy (SPASS), one for each challenge. Specifically, ASTE formulates the visual-semantic interactions in a latent structural support vector machine framework, adaptively adjusting the slack variables to reflect the differing reliability of training instances. To alleviate the domain shift problem in ZSL, SPASS borrows the idea of self-paced learning: it iteratively selects unseen instances, from reliable to less reliable, to gradually adapt the knowledge from the seen domain to the unseen domain. By combining SPASS and ASTE, we present a self-paced Transductive ASTE (TASTE) method that progressively reinforces the classification capacity. Extensive experiments on three benchmark data sets (i.e., AwA, CUB, and aPY) demonstrate the superiority of ASTE and TASTE. Furthermore, we propose a fast training (FT) strategy to improve the efficiency of most existing ZSL methods. The FT strategy is surprisingly simple and general, and it speeds up the training of most existing ZSL methods by 4 to 300 times while preserving their original performance.
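The self-paced selection idea described in the abstract (adapting to unseen instances from reliable to less reliable) can be illustrated with a small sketch. This is not the paper's SPASS or ASTE formulation: it swaps in a plain nearest-prototype classifier, measures reliability by the margin between the two best class scores, and uses a fixed schedule of selection fractions. The function names `self_paced_select` and `transductive_refine`, the `fractions` parameter, and the margin criterion are all assumptions made for illustration.

```python
import numpy as np


def self_paced_select(scores, fraction):
    """Pick the most confident `fraction` of instances (hypothetical helper).

    scores: (n_instances, n_classes) compatibility scores, higher is better.
    Confidence is assumed to be the gap between the top two class scores.
    Returns the chosen instance indices and the predicted label of every instance.
    """
    ordered = np.sort(scores, axis=1)
    margin = ordered[:, -1] - ordered[:, -2]
    k = max(1, int(np.ceil(fraction * scores.shape[0])))
    chosen = np.argsort(-margin)[:k]  # largest margins first
    return chosen, scores.argmax(axis=1)


def transductive_refine(X, prototypes, fractions=(0.25, 0.5, 1.0)):
    """Adapt class prototypes to unlabeled data X, easy instances first.

    X: (n_instances, dim) unseen-class features.
    prototypes: (n_classes, dim) initial class prototypes (assumed to come
    from a model trained on seen classes; here they are simply given).
    """
    protos = np.array(prototypes, dtype=float)

    def score(P):
        # Negative squared Euclidean distance to each prototype.
        return -((X[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)

    for frac in fractions:
        chosen, labels = self_paced_select(score(protos), frac)
        for c in range(protos.shape[0]):
            members = chosen[labels[chosen] == c]
            if members.size > 0:
                # Move the prototype to the mean of its confident members.
                protos[c] = X[members].mean(axis=0)
    return score(protos).argmax(axis=1), protos


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(5.0, 0.5, (20, 2))])
    labels, protos = transductive_refine(X, [[1.0, 1.0], [4.0, 4.0]])
    print(labels)
    print(protos)
```

Growing the selected fraction over the rounds is one simple way to mimic a "reliable to less reliable" curriculum; the schedule and the confidence measure are design choices of this sketch, not values taken from the paper.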