Constructing a Visual Relationship Authenticity Dataset

Chenhui Chu, Yuto Takebayashi, Mishra Vipul, Yuta Nakashima
DOI: https://doi.org/10.48550/arXiv.2010.05185
2020-10-11
Abstract: A visual relationship denotes a relationship between two objects in an image, which can be represented as a triplet of (subject; predicate; object). Visual relationship detection is crucial for scene understanding in images. Existing visual relationship detection datasets contain only true relationships that correctly describe the content of an image. However, distinguishing false visual relationships from true ones is also crucial for image understanding and grounded natural language processing. In this paper, we construct a visual relationship authenticity dataset, in which both true and false relationships among all objects that appear in the captions of the Flickr30k Entities image caption dataset are annotated. The dataset is available at https://github.com/codecreator2053/VR_ClassifiedDataset. We hope that this dataset can promote research on both vision and language understanding.
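To make the triplet representation concrete, the following is a minimal sketch of how a (subject; predicate; object) relationship with an authenticity label might be held in memory. The class name `Relationship`, the field `is_true`, the helper `split_by_authenticity`, and the example triplets are all hypothetical illustrations; they are not taken from the released dataset, whose actual file format is not described in the abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Relationship:
    """One visual relationship triplet with a true/false label (hypothetical schema)."""
    subject: str
    predicate: str
    object: str
    is_true: bool  # assumed label: does the triplet correctly describe the image?


def split_by_authenticity(
    relationships: List[Relationship],
) -> Tuple[List[Relationship], List[Relationship]]:
    """Separate relationships into (true, false) lists, preserving order."""
    true_rels = [r for r in relationships if r.is_true]
    false_rels = [r for r in relationships if not r.is_true]
    return true_rels, false_rels


# Made-up examples for illustration only, not real dataset entries.
examples = [
    Relationship("man", "rides", "horse", True),
    Relationship("horse", "rides", "man", False),
]
true_rels, false_rels = split_by_authenticity(examples)
```

Keeping the label on the triplet itself, rather than in a separate list, is one simple design choice for a binary authenticity classification setup; the actual annotation layout may differ.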
Computer Vision and Pattern Recognition, Artificial Intelligence, Computation and Language