Visual-Semantic Aligned Bidirectional Network for Zero-Shot Learning
Rui Gao, Xingsong Hou, Jie Qin, Yuming Shen, Yang Long, Li Liu, Zhao Zhang, Ling Shao
DOI: https://doi.org/10.1109/tmm.2022.3145666
IF: 7.3
2022-01-01
IEEE Transactions on Multimedia
Abstract: Zero-shot learning (ZSL) aims to recognize unknown categories that are unavailable during training. Recently, generative models have shown the potential to address this challenging problem by synthesizing unseen features conditioned on semantic embeddings such as attributes. However, unidirectional generative models cannot guarantee the effective coupling between visual and semantic spaces. To this end, we propose a visual-semantic aligned bidirectional network with cycle consistency to alleviate the gap between these two spaces, generating unseen features of high quality. More importantly, we incorporate two carefully designed strategies into our bidirectional framework to improve the overall ZSL performance. Specifically, we enhance the intra-domain class divergence in both visual and semantic spaces, and in the meantime, mitigate the inter-domain shift to preserve seen-unseen domain discrimination. Experimental results on four standard benchmarks show the superiority of our framework over existing state-of-the-art methods under both conventional and generalized ZSL settings.
computer science, information systems, telecommunications, software engineering