Fine-grained image emotion captioning based on Generative Adversarial Networks
Chunmiao Yang, Yang Wang, Liying Han, Xiran Jia, Hebin Sun
DOI: https://doi.org/10.1007/s11042-024-18680-4
IF: 2.577
2024-03-09
Multimedia Tools and Applications
Abstract: Image captioning, which combines natural language processing and computer vision, has developed rapidly in recent years. It is applied in data retrieval, navigation assistance for the blind, intelligent transportation, smart homes, medical assistance, news media and other domains. To improve the consistency and richness of caption language and to express people's subjective emotions effectively, this paper applies a Generative Adversarial Network (GAN) to obtain multi-stylized emotional image captions, generating two captions that convey positive and negative emotions, respectively. A Residual Network (ResNet) and a Gated Recurrent Unit (GRU) are integrated into the generator, while a capsule neural network serves as the discriminator. We conduct experiments on the popular MSCOCO and Senticap datasets to validate the model and demonstrate its satisfactory performance in comparison with current advanced image captioning approaches.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
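
The abstract names two building blocks, a GRU in the generator and a capsule network in the discriminator, without giving their equations. The following is a minimal sketch of the standard forms of both, not the authors' implementation: one GRU update step (biases omitted) and the capsule "squash" nonlinearity. The toy dimensions, the random weights, the use of an image feature as the initial hidden state, and the sentiment flag appended to the word input are all illustrative assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x, h, params):
    """One standard GRU update for input x and previous hidden state h.

    params = (Wz, Uz, Wr, Ur, Wh, Uh); biases are omitted for brevity.
    """
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = [sigmoid(a + b) for a, b in zip(matvec(Wz, x), matvec(Uz, h))]  # update gate
    r = [sigmoid(a + b) for a, b in zip(matvec(Wr, x), matvec(Ur, h))]  # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(Wh, x), matvec(Uh, rh))]
    # Blend candidate and old state; z close to 1 keeps the old state.
    return [(1.0 - zi) * ci + zi * hi for zi, ci, hi in zip(z, cand, h)]


def squash(s):
    """Capsule squash: keeps the direction of s, maps its length into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical toy sizes: 3-d word embedding plus 1 sentiment flag, 4-d hidden state.
rng = random.Random(0)
in_dim, hid_dim = 4, 4
params = tuple(
    random_matrix(hid_dim, in_dim if i % 2 == 0 else hid_dim, rng) for i in range(6)
)
image_feature = [0.1, -0.2, 0.3, 0.0]        # stand-in for a projected ResNet feature
word_plus_sentiment = [0.5, -0.1, 0.2, 1.0]  # last entry 1.0 = positive (assumed encoding)
h1 = gru_step(word_plus_sentiment, image_feature, params)
capsule_out = squash(h1)
```

In a full model the hidden state would feed a vocabulary softmax to pick the next word, and the discriminator would apply routing between capsule layers. Both are left out here because the abstract does not specify them.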