ESE-GAN: Zero-Shot Food Image Classification Based on Low Dimensional Embedding of Visual Features

Gaojie Li, Yaochen Li, Jingle Liu, Wei Guo, Wenneng Tang, Yuehu Liu
DOI: https://doi.org/10.1109/tmm.2024.3353457
IF: 7.3
2024-01-01
IEEE Transactions on Multimedia
Abstract: Existing zero-shot learning based image classification methods transform the zero-shot learning problem into a supervised learning problem by applying a generative adversarial network (GAN) to synthesize visual features of unseen classes. However, the visual features produced by the generator tend to be biased towards seen classes, and the discriminator is too weak to guide the generation of high-quality image features. To solve these problems, we propose a novel zero-shot food image classification method based on low dimensional embedding of visual features. Our method applies reinforced semantic guidance to increase the discriminative ability of the model by enhancing the strong distribution of input features. Moreover, the visual space is used as the embedding space, which reduces the bias towards seen classes by reducing the distance between semantic information and visual features in the embedding space. Finally, the feature distribution of unseen classes is further specified by improving the prototype similarity function. Extensive experiments on three food datasets and four general benchmark datasets demonstrate the effectiveness of the proposed method.
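The idea of using the visual space as the embedding space and comparing features against class prototypes can be illustrated with a small sketch. This is not the paper's model: the linear map `weights`, the cosine similarity, the class names, and all numbers below are assumptions chosen for illustration, whereas the abstract only states that semantic information is embedded into the visual space and that a prototype similarity function is improved.

```python
import math

def project(semantic, weights):
    """Map a semantic vector into the visual space with a (hypothetical) linear map.

    Each row of `weights` produces one visual dimension.
    """
    return [sum(w * s for w, s in zip(row, semantic)) for row in weights]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(feature, class_semantics, weights):
    """Return the class whose projected prototype is most similar to `feature`."""
    prototypes = {name: project(sem, weights) for name, sem in class_semantics.items()}
    return max(prototypes, key=lambda name: cosine(feature, prototypes[name]))

# Toy setup (all values are made up): 2-d semantic vectors, 3-d visual space.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
unseen = {"ramen": [1.0, 0.0], "tiramisu": [0.0, 1.0]}
feature = [0.9, 0.1, 1.0]  # visual feature of a test image

print(classify(feature, unseen, weights))
```

In this toy setup the unseen classes are recognized without any visual training samples: only their semantic vectors are needed, because the prototypes are obtained by projecting those vectors into the visual space.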