CLERC: A Dataset for Legal Case Retrieval and Retrieval-Augmented Analysis Generation

Abe Bohan Hou, Orion Weller, Guanghui Qin, Eugene Yang, Dawn Lawrie, Nils Holzenberger, Andrew Blair-Stanek, Benjamin Van Durme
2024-06-27
Abstract: Legal professionals need to write analyses that rely on citations to relevant precedents, i.e., previous case decisions. Intelligent systems that assist legal professionals in writing such documents provide great benefits but are challenging to design. Such systems need to help locate, summarize, and reason over salient precedents in order to be useful. To enable systems for such tasks, we work with legal professionals to transform a large open-source legal corpus into a dataset supporting two important backbone tasks: information retrieval (IR) and retrieval-augmented generation (RAG). This dataset, CLERC (Case Law Evaluation Retrieval Corpus), is constructed for training and evaluating models on their ability to (1) find corresponding citations for a given piece of legal analysis and (2) compile the text of these citations (as well as previous context) into a cogent analysis that supports a reasoning goal. We benchmark state-of-the-art models on CLERC, showing that current approaches still struggle: GPT-4o generates analyses with the highest ROUGE F-scores but hallucinates the most, while zero-shot IR models only achieve 48.3% recall@1000.
Computation and Language, Computers and Society
What problem does this paper attempt to address?
This paper addresses two core problems in legal analysis writing:

1. **Case retrieval**: Legal professionals must cite relevant precedents (previous case decisions) when writing legal analyses. Finding the relevant documents among millions of cases and weaving them into a persuasive whole takes a great deal of time and effort. The paper therefore constructs a dataset named CLERC (Case Law Evaluation Retrieval Corpus) for training and evaluating a model's ability to find the cited precedents that correspond to a given piece of legal analysis text.
2. **Retrieval-based generation**: Beyond finding relevant precedents, legal professionals must combine the content of those precedents with the surrounding context to produce analyses that are coherent and support specific arguments. CLERC also supports evaluating a model's ability to compile the cited precedents (and the preceding context) into a coherent analysis that supports a given reasoning goal.

Accordingly, the CLERC dataset supports two main tasks:

- **Information Retrieval (IR)**: Given a piece of legal analysis text, the model must find the relevant precedent citations.
- **Retrieval-Augmented Generation (RAG)**: Given the retrieved precedents and their context, the model must generate legal analysis text that supports specific arguments.

Through these two tasks, CLERC aims to improve legal intelligence systems so that they can assist legal professionals more effectively in case retrieval and legal analysis writing.
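
The abstract reports retrieval quality as recall@1000. As a rough illustration of what that metric measures, the sketch below computes recall@k for a single query: the fraction of the truly cited cases that appear among the top k retrieved results. The function name `recall_at_k` and the case identifiers are hypothetical and are not taken from the paper; the authors' exact evaluation protocol may differ.

```python
from typing import Iterable, Sequence


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Fraction of relevant documents found in the top-k of a ranked list.

    ranked_ids:   document IDs returned by a retriever, best first (hypothetical IDs).
    relevant_ids: IDs of the cases actually cited by the query passage.
    k:            cutoff rank, e.g. 1000 for recall@1000.
    """
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    if k < 0:
        raise ValueError("k must be non-negative")
    top_k = set(ranked_ids[:k])
    return len(relevant & top_k) / len(relevant)


# Toy example: two cited cases, only one of which the retriever returned.
ranking = ["case_17", "case_03", "case_42", "case_08"]
cited = {"case_03", "case_99"}

score_at_1 = recall_at_k(ranking, cited, k=1)  # top-1 is case_17, no cited case
score_at_2 = recall_at_k(ranking, cited, k=2)  # case_03 is found, case_99 is not
```

In this toy example `score_at_1` is 0.0 and `score_at_2` is 0.5. A corpus-level figure such as the one in the abstract would typically be the mean of this per-query value over all queries, although that averaging step is an assumption here.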