The ACM Multimedia 2024 Visual Spatial Description Grand Challenge

Yu Zhao, Hao Fei, Bobo Li, Meishan Zhang, Min Zhang
DOI: https://doi.org/10.1145/3664647.3688989
2024-01-01
Abstract: The Visual Spatial Description (VSD) Challenge is the first competition focused on visual spatial understanding, organized under the auspices of the ACM Multimedia Conference 2024. The goal of the VSD challenge is to assess the ability of models and systems to comprehend spatial concepts, relationships, and other semantics from a visually presented scene. The VSD challenge provides two benchmark datasets for three subtasks: visual spatial relationship classification, single spatial description generation, and open-ended spatial description generation. The challenge details are available at https://lllogen.github.io/vsd-challenge.github.io/.