Compound Projection Learning for Bridging Seen and Unseen Objects

Wenli Song, Lei Zhang, Xinbo Gao
DOI: https://doi.org/10.1109/tmm.2022.3142958
IF: 7.3
2022-01-01
IEEE Transactions on Multimedia
Abstract: Zero-shot learning (ZSL) aims to transfer knowledge from seen image categories to unseen ones by leveraging semantic information. It is generally assumed that seen and unseen objects share a common semantic space. Most existing ZSL methods focus on how to connect the visual space and the semantic space. However, because the visual distributions of seen and unseen objects differ, a projection function learned from the seen classes is biased when it transfers knowledge to unseen classes. We argue that, although the unseen objects are class-agnostic, the visual distribution information of unseen samples can be generated by exploiting semantic features. In this paper, we propose a Compound Projection Learning (CPL) model that transfers knowledge from seen to unseen objects by exploiting the information of both seen and class-agnostic samples. With the semantic representation projected by CPL, effective constraints such as a projection loss and a semantic reconstruction loss can be explored for seen and unseen objects, respectively, so that the semantic ambiguity across seen and unseen objects is reduced. Additionally, we use a similarity network to further explore the inter-class relationship by employing the labels and the similarities between seen and unseen classes. Extensive experiments on ZSL benchmark datasets show the effectiveness of the proposed approach.
computer science, information systems, telecommunications, software engineering
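
For orientation, the generic projection-based ZSL setting that the abstract starts from can be sketched as follows. This is not the CPL model: it is a plain ridge-regression baseline, assuming a linear map from visual features to class attribute vectors and cosine-similarity matching against unseen class prototypes. The names fit_projection and predict, the regularizer lam, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_projection(X, S, lam=1e-3):
    """Ridge regression: W minimizing ||X W - S||^2 + lam * ||W||^2.

    X: (n, d) visual features of seen samples.
    S: (n, a) attribute vector of each sample's seen class.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ S)

def predict(X, W, prototypes):
    """Project visual features and pick the most similar class prototype."""
    Z = X @ W
    Zn = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)
    Pn = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-12)
    return np.argmax(Zn @ Pn.T, axis=1)  # index into `prototypes`

# Toy data (assumed): visual feature = attribute vector times a fixed map A.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
seen_attrs = np.array([[1.0, 0.0],
                       [0.0, 1.0]])
unseen_attrs = np.array([[1.0, 1.0],
                         [1.0, -1.0]])

X_seen = seen_attrs @ A      # one noise-free sample per seen class
X_unseen = unseen_attrs @ A  # one noise-free sample per unseen class

W = fit_projection(X_seen, seen_attrs)      # learned from seen classes only
pred = predict(X_unseen, W, unseen_attrs)   # matched against unseen prototypes
print(pred)
```

The toy data is idealized: seen and unseen features come from the same noise-free linear map, so the projection learned on seen classes carries over cleanly. On real data the visual distributions differ, which is the bias the abstract describes and which CPL is proposed to reduce.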