ArtCap: A Dataset for Image Captioning of Fine Art Paintings

Yue Lu, Chao Guo, Xingyuan Dai, Fei-Yue Wang
DOI: https://doi.org/10.1109/tcss.2022.3223539
2022-01-01
IEEE Transactions on Computational Social Systems
Abstract: Image captioning of fine art paintings aims to generate content descriptions for paintings. Because modeling both image and language is complex, this task usually requires sufficient training data. However, unlike photographic image captioning, painting captioning has few satisfactory datasets. In this article, we introduce a painting captioning dataset, named ArtCap, which contains 3606 paintings with five descriptions for each painting. We present the carefully designed construction pipeline of our dataset and evaluate the dataset in two respects: annotation quality and application effectiveness. For annotation quality, we compare the global characteristics, annotation content, and annotation consistency of our dataset with those of other painting description datasets. For application effectiveness, we use our dataset and other painting description datasets to train image captioning models and analyze their captioning performance. The results demonstrate the promising annotation quality and application effectiveness of our dataset.