Abstract: The unbiased learning to rank (ULTR) problem has been greatly advanced by recent deep learning techniques and well-designed debiasing algorithms. However, promising results on existing benchmark datasets may not carry over to practical scenarios because of the following shortcomings of those popular benchmarks: (1) outdated semantic feature extraction, where state-of-the-art large-scale pre-trained language models such as BERT cannot be exploited because the original text is missing; (2) incomplete display features for in-depth study of ULTR, e.g., the displayed abstract of documents, needed to analyze the click necessity bias, is missing; and (3) a lack of real-world user feedback, which has led to the prevalence of synthetic datasets in empirical studies. To overcome these shortcomings, we introduce the Baidu-ULTR dataset. It comprises 1.2 billion randomly sampled search sessions and 7,008 expert-annotated queries, which is orders of magnitude larger than existing datasets. Baidu-ULTR provides: (1) the original semantic features and a pre-trained language model for ease of use; (2) sufficient display information such as position, displayed height, and displayed abstract, enabling the comprehensive study of different biases with advanced techniques such as causal discovery and meta-learning; and (3) rich user feedback on search engine result pages (SERPs), such as dwell time, allowing for user engagement optimization and promoting the exploration of multi-task learning in ULTR. In this paper, we present the design principles of Baidu-ULTR and the performance of benchmark ULTR algorithms on this new data resource, which favors the exploration of ranking for long-tail queries and of pre-training tasks for ranking. The Baidu-ULTR dataset and the corresponding baseline implementation are available at https://github.com/ChuXiaokai/baidu_ultr_dataset.

Dense Re-Ranking with Weak Supervision for RDF Dataset Search

Investigating Weak Supervision in Deep Ranking

Learning Domain‐specific Semantic Representation from Weakly Supervised Data to Improve Research Dataset Retrieval

Few-Shot Text Ranking with Meta Adapted Synthetic Weak Supervision

Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers

Training Deep Ranking Model with Weak Relevance Labels

SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking

Learning To Retrieve: How to Train a Dense Retrieval Model Effectively and Efficiently

Enhancing the Ranking Context of Dense Retrieval Methods through Reciprocal Nearest Neighbors

Meta Learning to Rank for Sparsely Supervised Queries

A Large Scale Search Dataset for Unbiased Learning to Rank

Language-model-based ranking for queries on RDF-graphs

Enhancing Dataset Search with Compact Data Snippets

Search Result Reranking with Visual and Structure Information Sources

DeepRank: A New Deep Architecture for Relevance Ranking in Information Retrieval

Semi-supervised document retrieval

Long short-term search session-based document re-ranking model

PairDistill: Pairwise Relevance Distillation for Dense Retrieval

UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers

Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback

Combining Multiple Supervision for Robust Zero-Shot Dense Retrieval