Benefit from AMR: Image Captioning with Explicit Relations and Endogenous Knowledge

Feng Chen, Xinyi Li, Jintao Tang, Shasha Li, Ting Wang
DOI: https://doi.org/10.1007/978-981-97-2390-4_25
2024-01-01
Abstract: Recent image captioning methods mostly model implicit relationships among objects through object-based visual features, but they fail to capture explicit relations or achieve semantic association. To address these problems, we present a novel method based on Abstract Meaning Representation (AMR). Specifically, in addition to modeling implicit relationships among visual features, we design an AMR generator that extracts explicit relations from images, and we further model these relations during caption generation. We also construct an AMR-based endogenous knowledge graph that supplies prior knowledge for semantic association, strengthening the captioning model's semantic expressiveness without any external resources. Extensive experiments on the public MS COCO dataset show that the AMR-based explicit semantic features and the associated semantic features further improve the quality of generated captions.
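The abstract does not say how the endogenous knowledge graph is built or queried, so the following is only a minimal sketch of one plausible reading: AMR triples taken from the training captions are aggregated into a frequency-weighted graph, which is then looked up for concepts associated with a given concept. The triples, the function names `build_graph` and `associated`, and the frequency-based ranking are all illustrative assumptions, not the authors' method.

```python
from collections import defaultdict

# Hypothetical AMR triples (source concept, relation, target concept), written
# by hand as if parsed from training captions. They are illustrative only.
caption_triples = [
    [("ride-01", ":ARG0", "man"), ("ride-01", ":ARG1", "horse")],
    [("ride-01", ":ARG0", "woman"), ("ride-01", ":ARG1", "bicycle")],
    [("ride-01", ":ARG0", "man"), ("ride-01", ":ARG1", "bicycle")],
]


def build_graph(captions):
    """Count how often each (relation, target) edge leaves each concept."""
    graph = defaultdict(lambda: defaultdict(int))
    for triples in captions:
        for source, relation, target in triples:
            graph[source][(relation, target)] += 1
    return graph


def associated(graph, concept, top_k=2):
    """Return the top_k most frequent outgoing edges of a concept.

    Ties are broken alphabetically so the result is deterministic.
    Ranking by raw frequency is an assumption made for this sketch.
    """
    edges = graph.get(concept, {})
    ranked = sorted(edges.items(), key=lambda kv: (-kv[1], kv[0]))
    return [edge for edge, _ in ranked[:top_k]]


graph = build_graph(caption_triples)
print(associated(graph, "ride-01"))
```

Because the graph is built only from the training captions themselves, it needs no external knowledge base, which is the sense in which the abstract calls the knowledge "endogenous".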