Generative-based hybrid model with semantic representations for generalized zero-shot learning

Emre Akdemir, Necaattin Barisci
DOI: https://doi.org/10.1007/s11760-024-03734-9
IF: 1.583
2024-12-04
Signal Image and Video Processing
Abstract: Generalized Zero-Shot Learning (GZSL) aims to recognize instances of both seen and unseen classes using semantic information and labeled instances of seen classes only. To address the data imbalance problem in GZSL, generator networks produce synthetic features from visual and semantic features. However, the original semantic features are insufficiently distinctive. This study proposes a hybrid generative GZSL framework that combines the semantic descriptions used in fine-grained datasets with semantic attributes to increase the distinctiveness of the original semantic features. The proposed model thus provides semantic information enriched by both semantic attributes and semantic descriptions. A Generative Adversarial Network (GAN) and a Variational Autoencoder (VAE) generate visual features of unseen classes from this semantic information, enabling generalized classification. The proposed method was evaluated through extensive experiments on the AWA2, CUB, FLO, and SUN GZSL benchmark datasets, achieving 70.96% on AWA2, 76.71% on CUB, 92.12% on FLO, and 42.84% on SUN. Notably, the proposed method achieved superior GZSL results on the fine-grained CUB and FLO datasets, 3.41% and 3.52% respectively.
engineering, electrical & electronic; imaging science & photographic technology