Towards More Usable Dataset Search: From Query Characterization to Snippet Generation

Jinchi Chen, Xiaxia Wang, Gong Cheng, Evgeny Kharlamov, Yuzhong Qu
DOI: https://doi.org/10.48550/arXiv.1908.11146
2019-08-29
Information Retrieval
Abstract: Reusing published datasets on the Web is of great interest to researchers and developers. Their data needs may be met by submitting queries to a dataset search engine to retrieve relevant datasets. In this ongoing work towards developing a more usable dataset search engine, we characterize real data needs by annotating the semantics of 1,947 queries using a novel fine-grained scheme, to provide implications for enhancing dataset search. Based on the findings, we present a query-centered framework for dataset search, explore the implementation of snippet generation, and evaluate it with a preliminary user study.