Sogou-QCL

Yukun Zheng,Zhen Fan,Yiqun Liu,Cheng Luo,Min Zhang,Shaoping Ma
DOI: https://doi.org/10.1145/3209978.3210092
2018-01-01
Abstract:Data is of vital importance in the development of machine learning technologies. Recently, within the information retrieval field, a number of neural ranking frameworks have been proposed to address the ad-hoc search. These models usually need a large amount of query-document relevance judgments for training. However, obtaining this kind of relevance judgments needs a lot of money and manual effort. To shed light on this problem, researchers seek to use implicit feedback from users of search engines to improve the ranking performance. In this paper, we present a new dataset, Sogou-QCL, which contains 537,366 queries and five kinds of weak relevance labels for over 12 million query-document pairs. We apply Sogou-QCL dataset to train recent neural ranking models and show its potential to serve as weak supervision for ranking. We believe that Sogou-QCL will have a broad impact on corresponding areas.
What problem does this paper attempt to address?