Emotion Interpretational Caption of Art Visual based on Reinforcement Learning

Ting Luo, Xiaodan Zhang, Jinye Peng, Dekui Wang, Wei Zhou
DOI: https://doi.org/10.1109/ICIPMC58929.2023.00021
2023-01-01
Abstract: Caption generation for the emotional interpretation of art is a challenging task. The difficulty is that the model must generate captions consistent with human cognition by better learning the interrelations among images, emotions, and the corresponding texts. However, current approaches to emotion interpretation follow the traditional captioning model, which cannot improve objective metrics (such as BLEU-4 and CIDEr) and emotion metrics (consistency of emotion interpretation) simultaneously. This shows that such models still have considerable room for improvement. Therefore, this paper designs an emotion interpretational caption model based on reinforcement learning. It consists of auxiliary tasks, a traditional captioning model, and reinforcement learning, in which the reinforcement learning component uses a reward on emotional concepts to optimize performance and avoid bias in the interpretation of emotion types. The framework is validated through extensive quantitative, qualitative, and diversity analyses, and achieves strong results on both objective and emotion metrics. These results demonstrate the effectiveness of reinforcement learning with emotional concepts in improving the performance of emotion interpretation captioning models.
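One way to picture a reward that accounts for emotional concepts is the minimal sketch below. It is not the paper's implementation: the function names, the concept word list, the weight `lam`, and the greedy-baseline (self-critical) advantage are all assumptions made for illustration, and the caption-quality score is passed in as a plain number standing in for a metric such as CIDEr.

```python
# Illustrative sketch only: every name, weight, and word list here is hypothetical.

# Hypothetical emotion-concept vocabulary: emotion label -> associated words.
EMOTION_CONCEPTS = {
    "awe": {"vast", "majestic", "grand"},
    "sadness": {"lonely", "gloomy", "dark"},
}


def emotion_reward(caption: str, emotion: str) -> float:
    """Fraction of caption tokens found in the target emotion's concept set."""
    tokens = caption.lower().split()
    if not tokens:
        return 0.0
    concepts = EMOTION_CONCEPTS.get(emotion, set())
    return sum(token in concepts for token in tokens) / len(tokens)


def total_reward(metric_score: float, caption: str, emotion: str,
                 lam: float = 0.5) -> float:
    """Caption-quality score plus a weighted emotion-concept term."""
    return metric_score + lam * emotion_reward(caption, emotion)


def advantage(sampled: tuple, greedy: tuple, emotion: str,
              lam: float = 0.5) -> float:
    """Reward of a sampled caption minus the reward of a greedy baseline.

    Each argument is a (metric_score, caption) pair. In a policy-gradient
    update, this value would scale the log-probability of the sampled caption.
    """
    sampled_score, sampled_caption = sampled
    greedy_score, greedy_caption = greedy
    return (total_reward(sampled_score, sampled_caption, emotion, lam)
            - total_reward(greedy_score, greedy_caption, emotion, lam))
```

Under these assumptions, a sampled caption that matches the metric score of the greedy caption but uses more words tied to the target emotion receives a positive advantage, so the update would favor it.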