Result Diversification for Legal Case Retrieval

Ruizhe Zhang, Qingyao Ai, Yueyue Wu, Yixiao Ma, Yiqun Liu
DOI: https://doi.org/10.1145/3624918.3625319
2023-01-01
Abstract: Legal case retrieval has received considerable attention in the last decade. As more and more legal documents are collected and stored in digital form, the need for efficient and reliable access to relevant information in large-scale legal databases continues to grow. While most existing studies have focused on differentiating relevant documents from irrelevant ones based on their similarity to the query case, user studies have revealed that similarity is not the sole concern in legal case retrieval. In many instances, users require not only cases that are similar in content but also cases that encompass a broad range of subtopics (i.e., charges) related to the query case. In contrast to open-domain retrieval, such as web search, our research has found that search diversification in legal case retrieval involves a smaller number of highly correlated subtopics. To address this issue, we have constructed a Diversity Legal Retrieval dataset (DLR-dataset) that includes query-charge labels and charge-level relevance labels between the query case and candidate cases. Additionally, we propose a Diversified Legal Case Retrieval Model (DLRM) that simultaneously considers topical relevance and subtopic relationships using a legal relationship graph. Experimental results demonstrate that DLRM outperforms existing diversified search baselines in the field of legal retrieval.
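To make the idea of charge-level diversification concrete, the following is a minimal sketch of a generic greedy re-ranker that trades off relevance against newly covered charges. It is not the DLRM model described in the abstract and does not use a legal relationship graph; the function name `greedy_diversify`, the `trade_off` parameter, and the toy case IDs and charges are all assumptions introduced for illustration.

```python
from typing import Dict, List, Set


def greedy_diversify(
    relevance: Dict[str, float],
    charges: Dict[str, Set[str]],
    query_charges: Set[str],
    k: int,
    trade_off: float = 0.5,
) -> List[str]:
    """Greedily pick up to k cases, balancing relevance and new charge coverage.

    Hypothetical illustration only: a generic coverage-based heuristic,
    not the method proposed in the paper.
    """
    selected: List[str] = []
    covered: Set[str] = set()      # query charges covered by selected cases
    remaining = set(relevance)

    while remaining and len(selected) < k:

        def score(case_id: str) -> float:
            # Query charges this case would cover that are not yet covered.
            new = (charges.get(case_id, set()) & query_charges) - covered
            novelty = len(new) / len(query_charges) if query_charges else 0.0
            return (1 - trade_off) * relevance[case_id] + trade_off * novelty

        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        covered |= charges.get(best, set()) & query_charges
        remaining.remove(best)

    return selected


# Toy data (assumed): two highly relevant theft cases and one fraud case.
relevance = {"a": 0.9, "b": 0.85, "c": 0.5}
charges = {"a": {"theft"}, "b": {"theft"}, "c": {"fraud"}}
query_charges = {"theft", "fraud"}

diversified = greedy_diversify(relevance, charges, query_charges, k=2)
relevance_only = greedy_diversify(
    relevance, charges, query_charges, k=2, trade_off=0.0
)
```

With these toy inputs, the diversified ranking selects one case per charge, while setting `trade_off=0.0` reduces the procedure to ranking by relevance alone.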