Incorporating Retrieval Information into the Truncation of Ranking Lists for Better Legal Search

Yixiao Ma, Qingyao Ai, Yueyue Wu, Yunqiu Shao, Yiqun Liu, Min Zhang, Shaoping Ma
DOI: https://doi.org/10.1145/3477495.3531998
2022-01-01
Abstract: The truncation of ranking lists predicted by retrieval models is vital to users' search experience. In vertical domains where documents are usually complicated and extensive (e.g., legal cases), the cost of browsing results is much higher than in traditional IR tasks (e.g., Web search), so setting a reasonable cut-off position is especially important. While it is straightforward to apply existing result list truncation approaches to legal case retrieval, their effectiveness is limited because they focus only on simple document statistics and usually fail to capture the context information of documents in the ranking list. These approaches also treat result list truncation as an isolated task rather than a component of the entire ranking process, which limits the use of truncation in practical systems. To address these limitations, we propose LeCut, a ranking list truncation model for legal case retrieval. LeCut uses contextual features of the retrieval task to capture the semantic-level similarity between documents and decides the best cut-off position with attention mechanisms. We further propose a Joint Optimization of Truncation and Reranking (JOTR) framework based on LeCut to improve the performance of the truncation and retrieval tasks simultaneously. Comparisons against competitive baselines on public benchmark datasets demonstrate the effectiveness of LeCut and JOTR. A case study visualizes the cut-off positions chosen by LeCut and shows how JOTR improves both the retrieval and truncation tasks.
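To make the notion of a "best cut-off position" concrete, the following is a minimal sketch of an oracle truncation: given binary relevance labels for a ranked list, it returns the prefix length that maximizes a chosen metric. The abstract does not state which metric is optimized, so the use of F1 here is an assumption made for illustration, and the function names `f1_at_cutoff` and `best_cutoff` are hypothetical. This is not LeCut itself, which predicts the cut-off from contextual features without access to labels; a label-based oracle of this kind only shows the target such a model might try to approximate.

```python
from typing import List, Tuple


def f1_at_cutoff(relevance: List[int], k: int) -> float:
    """F1 of the top-k prefix of a ranked list with binary relevance labels.

    Assumption: F1 is used purely as an illustrative truncation metric.
    """
    total_relevant = sum(relevance)
    if k == 0 or total_relevant == 0:
        return 0.0
    hits = sum(relevance[:k])
    if hits == 0:
        return 0.0
    precision = hits / k
    recall = hits / total_relevant
    return 2 * precision * recall / (precision + recall)


def best_cutoff(relevance: List[int]) -> Tuple[int, float]:
    """Return the (cut-off position, F1) pair that maximizes F1.

    Ties are broken in favor of the shorter list; a cut-off of 0 means
    returning no results at all.
    """
    best_k, best_f1 = 0, 0.0
    for k in range(1, len(relevance) + 1):
        score = f1_at_cutoff(relevance, k)
        if score > best_f1:
            best_k, best_f1 = k, score
    return best_k, best_f1


# Hypothetical ranked list: 1 = relevant case, 0 = irrelevant case.
labels = [1, 1, 0, 1, 0, 0]
k, f1 = best_cutoff(labels)
print(f"cut after position {k}, F1 = {f1:.3f}")
```

In this toy list, cutting after the fourth result keeps all three relevant cases at the cost of one irrelevant one; extending the list further only lowers precision, which is the browsing cost the abstract refers to.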