CLAREL: Classification via retrieval loss for zero-shot learning

Boris N. Oreshkin, Negar Rostamzadeh, Pedro O. Pinheiro, Christopher Pal
DOI: https://doi.org/10.1109/cvprw50498.2020.00466
2020-06-01
Abstract: We address the problem of learning cross-modal representations. We propose an instance-based deep metric learning approach in a joint visual and textual space. The key novelty of this paper is showing that per-image semantic supervision leads to a substantial improvement in zero-shot performance over class-only supervision. We also provide a probabilistic justification and empirical validation for a metric rescaling approach that balances seen and unseen accuracy in the generalized zero-shot learning (GZSL) task. We evaluate our approach on two fine-grained zero-shot datasets: CUB and Flowers.
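
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the loss or the rescaling rule, so the symmetric in-batch softmax loss, the negative squared Euclidean similarity, the multiplicative `scale` factor, and the function names `retrieval_loss` and `gzsl_predict` are all assumptions made for illustration.

```python
import numpy as np

def log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def retrieval_loss(img_emb, txt_emb):
    """Symmetric in-batch retrieval loss (assumed form).

    Row i of img_emb and row i of txt_emb are a matched image/text pair,
    so supervision is per instance rather than per class.
    """
    # Similarity = negative squared Euclidean distance (an assumption).
    diff = img_emb[:, None, :] - txt_emb[None, :, :]
    logits = -np.sum(diff ** 2, axis=-1)          # shape (B, B)
    idx = np.arange(len(img_emb))
    # Retrieve the matching text for each image, and vice versa.
    img_to_txt = -log_softmax(logits, axis=1)[idx, idx].mean()
    txt_to_img = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (img_to_txt + txt_to_img)

def gzsl_predict(distances, seen_mask, scale=1.0):
    """Nearest-prototype prediction with rescaled seen-class distances.

    distances: (N, C) image-to-class-prototype distances.
    seen_mask: (C,) boolean, True for classes seen during training.
    scale > 1 penalizes seen classes, shifting predictions toward
    unseen classes (hypothetical rescaling rule).
    """
    rescaled = np.where(seen_mask[None, :], distances * scale, distances)
    return rescaled.argmin(axis=1)
```

Under these assumptions, `scale` would be tuned on held-out data to trade seen-class accuracy against unseen-class accuracy; `scale=1.0` recovers plain nearest-prototype classification.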