Towards Personalized Aesthetic Image Caption

Kun Xiong, Liu Jiang, Xuan Dang, Guolong Wang, Wenwen Ye, Zheng Qin
DOI: https://doi.org/10.1109/ijcnn48605.2020.9206953
2020-01-01
Abstract: Image captioning (IC) is a commonly used technique for generating textual descriptions of images, with applications in semantic image retrieval and multi-modal image understanding, among many others. This paper focuses on an important IC task specialized for generating aesthetic descriptions of images, i.e., aesthetic image captioning (AIC). Although initial work on AIC has shown some effectiveness, its performance is inherently limited because it neither accounts for user preferences on aesthetics nor uses sufficiently expressive aesthetic features. This makes it unusable for real-world applications, where human users vary widely in how they evaluate the visual aesthetics of images. To tackle this, we propose a novel personalized aesthetic image caption (PAIC) approach that captures user preferences and incorporates them into AIC tasks. Our approach consists mainly of an Aesthetic feature Extraction Network (AEN), a User Encoder Network (UEN), and a personalized image caption model. The AEN is designed to extract more expressive features, and the UEN learns a user vector from the limited information in our AVA-PCap dataset. The personalized image caption model generates a caption when given a pair of user ID and photo. The experimental results show that our methods outperform the baselines by 10%, which is encouraging for a first step towards personalized aesthetic image captioning.
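The data flow described in the abstract (a user ID and a photo go in, a user-conditioned caption comes out) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the internals of the AEN, the UEN, or the caption model, so everything here is an assumption made for illustration, including the class name `PersonalizedCaptioner`, the feature sizes, the toy vocabulary, the use of random untrained weights, the fusion by concatenation, and the per-step linear decoder with greedy word selection.

```python
import random

FEATURE_DIM = 4  # hypothetical size of the aesthetic feature (AEN output)
USER_DIM = 3     # hypothetical size of the user vector (UEN output)
MAX_LEN = 5      # hypothetical maximum caption length
VOCAB = ["<end>", "nice", "soft", "light", "colors", "composition"]  # toy vocabulary


def random_matrix(rng, rows, cols):
    """Untrained stand-in weights; a real model would learn these."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class PersonalizedCaptioner:
    """Hypothetical sketch of the (user ID, photo) -> caption pipeline."""

    def __init__(self, num_users, image_dim, seed=0):
        rng = random.Random(seed)
        # Stand-in for the AEN: a single linear projection of the image.
        self.aen = random_matrix(rng, FEATURE_DIM, image_dim)
        # Stand-in for the UEN: one vector per user ID (an embedding table).
        self.uen = random_matrix(rng, num_users, USER_DIM)
        # Toy decoder: one linear scoring layer per output position.
        self.decoder = [
            random_matrix(rng, len(VOCAB), FEATURE_DIM + USER_DIM)
            for _ in range(MAX_LEN)
        ]

    def condition(self, user_id, image):
        # Assumed fusion: concatenate the aesthetic feature and the user vector.
        return matvec(self.aen, image) + self.uen[user_id]

    def caption(self, user_id, image):
        context = self.condition(user_id, image)
        words = []
        for step_weights in self.decoder:
            scores = matvec(step_weights, context)
            word = VOCAB[scores.index(max(scores))]  # greedy choice
            if word == "<end>":
                break
            words.append(word)
        return words


model = PersonalizedCaptioner(num_users=2, image_dim=6)
image = [0.2, 0.8, 0.5, 0.1, 0.9, 0.4]  # toy flattened "photo"
caption_user0 = model.caption(0, image)
caption_user1 = model.caption(1, image)
```

Because the user vector is part of the conditioning input, the same photo can yield different captions for different user IDs, which is the sense of personalization the abstract describes. With the random weights used here the output words carry no meaning; the sketch only shows how the three components connect.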