Image Captioning with Affective Guiding and Selective Attention

Anqi Wang,Haifeng Hu,Liang Yang
DOI: https://doi.org/10.1145/3226037
IF: 4.094
2018-01-01
ACM Transactions on Multimedia Computing Communications and Applications
Abstract:Image captioning is an increasingly important problem associated with artificial intelligence, computer vision, and natural language processing. Recent works revealed that it is possible for a machine to generate meaningful and accurate sentences for images. However, most existing methods ignore latent emotional information in an image. In this article, we propose a novel image captioning model with Affective Guiding and Selective Attention Mechanism named AG-SAM. In our method, we aim to bridge the affective gap between image captioning and the emotional response elicited by the image. First, we introduce affective components that capture higher-level concepts encoded in images into AG-SAM. Hence, our language model can be adapted to generate sentences that are more passionate and emotive. In addition, a selective gate acting on the attention mechanism controls the degree of how much visual information AG-SAM needs. Experimental results have shown that our model outperforms most existing methods, clearly reflecting an association between images and emotional components that is usually ignored in existing works.
What problem does this paper attempt to address?