Multi-modal Generative Adversarial Network for Zero-Shot Learning

Zhong Ji, Kexin Chen, Junyue Wang, Yunlong Yu, Zhongfei Zhang
DOI: https://doi.org/10.1016/j.knosys.2020.105847
IF: 8.139
2020-01-01
Knowledge-Based Systems
Abstract: In this paper, we propose a novel approach for Zero-Shot Learning (ZSL), in which the test instances come from novel categories for which no visual data are available during training. Existing approaches typically address ZSL by embedding the visual features into a category-shared semantic space. However, these embedding-based approaches are prone to the “heterogeneity gap” issue, since a single type of class semantic prototype cannot characterize the categories well. To alleviate this issue, we assume that different class semantics reflect different views of the corresponding class, and thus fuse various types of class semantic prototypes residing in different semantic spaces with a feature fusion network to generate pseudo visual features. Through the adversarial mechanism between the real visual features and the fused pseudo visual features, the complementary semantics in the various spaces are effectively captured. Experimental results on three benchmark datasets demonstrate that the proposed approach achieves impressive performance on both traditional ZSL and generalized ZSL tasks.
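The fusion-and-adversarial idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The following are all assumptions made for illustration: the two semantic views (attributes and word vectors), the fusion by concatenation, the noise input, the layer sizes, the standard GAN loss, and every name (`generate`, `discriminate`, `gan_losses`). Only the forward pass and the losses are shown; no training is performed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: attribute view, word-vector view, noise, hidden, visual feature.
ATTR_DIM, WORD_DIM, NOISE_DIM, HID_DIM, VIS_DIM = 85, 300, 16, 128, 2048


def init(n_in, n_out):
    """Return small random weights and a zero bias for one dense layer."""
    return rng.normal(0.0, 0.02, size=(n_in, n_out)), np.zeros(n_out)


# Fusion generator (assumed): concatenated prototypes + noise -> pseudo visual feature.
G1 = init(ATTR_DIM + WORD_DIM + NOISE_DIM, HID_DIM)
G2 = init(HID_DIM, VIS_DIM)
# Discriminator (assumed): visual feature -> probability that it is real.
D1 = init(VIS_DIM, HID_DIM)
D2 = init(HID_DIM, 1)


def dense(x, layer):
    """Apply one affine layer."""
    w, b = layer
    return x @ w + b


def generate(attrs, words):
    """Fuse two semantic views of each class into pseudo visual features."""
    noise = rng.normal(size=(attrs.shape[0], NOISE_DIM))
    fused = np.concatenate([attrs, words, noise], axis=1)
    hidden = np.maximum(dense(fused, G1), 0.0)  # ReLU
    return np.maximum(dense(hidden, G2), 0.0)   # ReLU keeps outputs non-negative


def discriminate(feats):
    """Score each feature vector with a sigmoid probability of being real."""
    hidden = np.maximum(dense(feats, D1), 0.0)
    return 1.0 / (1.0 + np.exp(-dense(hidden, D2)))


def gan_losses(real_feats, attrs, words, eps=1e-8):
    """Standard (non-saturating) GAN losses for discriminator and generator."""
    p_real = discriminate(real_feats)
    p_fake = discriminate(generate(attrs, words))
    d_loss = -np.mean(np.log(p_real + eps) + np.log(1.0 - p_fake + eps))
    g_loss = -np.mean(np.log(p_fake + eps))
    return float(d_loss), float(g_loss)


# Toy batch of random stand-ins for class prototypes and real visual features.
batch = 4
attrs = rng.random((batch, ATTR_DIM))
words = rng.normal(size=(batch, WORD_DIM))
real = rng.random((batch, VIS_DIM))

pseudo = generate(attrs, words)
d_loss, g_loss = gan_losses(real, attrs, words)
print(pseudo.shape, d_loss, g_loss)
```

In a setup like this, a classifier for unseen classes could be trained on pseudo features generated from their prototypes. Whether the paper follows that exact recipe is not stated in the abstract.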