ACORDAR: A Test Collection for Ad Hoc Content-Based (RDF) Dataset Retrieval

Tengteng Lin, Qiaosheng Chen, Gong Cheng, Ahmet Soylu, Basil Ell, Ruoqi Zhao, Qing Shi, Xiaxia Wang, Yu Gu, Evgeny Kharlamov
DOI: https://doi.org/10.1145/3477495.3531729
2022-01-01
Abstract: Ad hoc dataset retrieval is a trending topic in IR research. Methods and systems are evolving from metadata-based to content-based ones, which exploit the data itself to improve retrieval accuracy, but thus far they lack a specialized test collection. In this paper, we build and release the first test collection for ad hoc content-based dataset retrieval, in which content-oriented dataset queries and content-based relevance judgments are annotated by human experts assisted by a dashboard designed specifically for comprehensive and convenient browsing of both the metadata and the data of a dataset. We conduct extensive experiments on the test collection to analyze its difficulty and provide insights into the underlying task.