Dual-focus transfer network for zero-shot learning

Zhen Jia, Zhang Zhang, Caifeng Shan, Liang Wang, Tieniu Tan
DOI: https://doi.org/10.1016/j.neucom.2023.126264
IF: 6
2023-04-28
Neurocomputing
Abstract: Zero-shot learning aims to recognize image categories that are "unseen" in the training phase of image classification models. The key to this task is to transfer the knowledge learned from "seen" classes to "unseen" classes. To make this knowledge transfer more effective, we propose to exploit visual and semantic attention mechanisms simultaneously in zero-shot learning tasks. Specifically, a dual-focus transfer network (DFTN) model is proposed that implements attention mechanisms at both the visual and semantic ends of a mapping-based zero-shot learning framework, using a visual focus transfer (VFT) module and a semantic focus transfer (SFT) module. The VFT module is composed of multi-head self-attention networks, which assign greater weights to salient parts of images at different resolutions of the feature maps. The SFT module generates semantic weights to re-weight semantic attribute features under the guidance of visual representations, so that semantic attributes with greater visual discriminative capability receive greater weights. Extensive experiments on zero-shot learning and generalized zero-shot learning across five representative benchmarks demonstrate the superiority of the proposed DFTN model over other state-of-the-art methods.
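The semantic re-weighting idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the SFT architecture, so the softmax gate, the linear maps `W_gate` and `W_map`, the dot-product compatibility score, and all shapes and random values below are assumptions chosen only to show how visually guided attribute weights could enter a mapping-based classifier.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def semantic_focus(visual_feat, class_attrs, W_gate):
    """Re-weight class attribute vectors using weights derived from the image.

    visual_feat: (d_v,) visual representation of one image
    class_attrs: (n_classes, n_attrs) attribute vector per class
    W_gate:      (n_attrs, d_v) hypothetical gating matrix (an assumption)
    """
    # One non-negative weight per attribute, summing to 1.
    attr_weights = softmax(W_gate @ visual_feat)
    # Broadcasting scales every class's attributes by the same weights.
    return class_attrs * attr_weights, attr_weights

def classify(visual_feat, class_attrs, W_gate, W_map):
    """Score classes by compatibility in attribute space.

    W_map: (n_attrs, d_v) hypothetical visual-to-semantic mapping (an assumption)
    """
    weighted_attrs, _ = semantic_focus(visual_feat, class_attrs, W_gate)
    projected = W_map @ visual_feat        # image embedded in attribute space
    scores = weighted_attrs @ projected    # one score per class
    return int(np.argmax(scores)), scores

# Toy data; all values are random placeholders, not results from the paper.
rng = np.random.default_rng(0)
d_v, n_attrs, n_classes = 8, 5, 3
visual_feat = rng.normal(size=d_v)
class_attrs = rng.random((n_classes, n_attrs))
W_gate = rng.normal(size=(n_attrs, d_v))
W_map = rng.normal(size=(n_attrs, d_v))

pred, scores = classify(visual_feat, class_attrs, W_gate, W_map)
```

In this sketch, attributes that receive larger weights for a given image contribute more to every class score, which mirrors the stated goal of emphasizing attributes with greater visual discriminative capability. In a trained model the two matrices would be learned on seen classes, and `class_attrs` would hold the attribute vectors of the unseen classes at test time.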
computer science, artificial intelligence
What problem does this paper attempt to address?