Abstract: Legal case retrieval has received increasing attention in recent years. However, compared to ad-hoc retrieval tasks, it poses unique challenges. First, case documents are lengthy and contain complex legal structures, so it is difficult for most existing dense retrieval models to encode an entire document and capture its inherent structural information. Most existing methods simply truncate the document to meet the input length limit of pre-trained language models (PLMs), which leads to information loss. Second, the definition of relevance in the legal domain differs from that in the general domain, and previous semantic-based or lexical-based methods fail to provide a comprehensive understanding of the relevance of legal cases. In this paper, we propose a Structured Legal case Retrieval (SLR) framework, which incorporates internal and external structural information to address these two challenges. To avoid truncating long legal documents, SLR uses the internal structural information, that is, the organization pattern of legal documents, to split a case document into segments. By dividing the document-level semantic matching task into segment-level subtasks, SLR can process each segment with a method suited to its characteristics. In this way, the key elements of a case document are highlighted without losing the rest of its content. To better capture relevance in the legal domain, we investigate the connections between criminal charges appearing in a large-scale case corpus and generate a charge-wise relation graph. The similarity between criminal charges can then be pre-computed as the external structural information to enhance the recognition of relevant cases. Finally, a learning-to-rank algorithm integrates the features collected from the internal and external structures to output the final retrieval results.
Experimental results on public legal case retrieval benchmarks demonstrate the superior effectiveness of SLR over existing state-of-the-art baselines, including traditional bag-of-words and neural methods. Furthermore, we conduct a case study to visualize how the proposed model focuses on key elements and improves retrieval performance.
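
The pipeline described in the abstract (segment the document, score segments separately, pre-compute charge similarity, then combine the features) can be illustrated with a minimal sketch. This is not the SLR implementation. The section markers, the bag-of-words cosine used for segment matching, the Jaccard-style co-occurrence measure used for charge similarity, and the fixed-weight linear scorer standing in for a learning-to-rank model are all assumptions made for illustration, as are all function names.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical section markers; real case documents use other conventions.
SECTION_MARKERS = ("FACTS:", "REASONING:", "DECISION:")


def split_into_segments(document):
    """Split a case document into named segments at the assumed markers."""
    pattern = "(" + "|".join(re.escape(m) for m in SECTION_MARKERS) + ")"
    parts = re.split(pattern, document)
    # parts looks like [preamble, marker, text, marker, text, ...]
    return {
        marker.rstrip(":").lower(): text.strip()
        for marker, text in zip(parts[1::2], parts[2::2])
    }


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment_features(query_doc, candidate_doc):
    """One lexical similarity feature per segment (stand-in for per-segment models)."""
    q = split_into_segments(query_doc)
    c = split_into_segments(candidate_doc)
    features = {}
    for marker in SECTION_MARKERS:
        name = marker.rstrip(":").lower()
        q_bag = Counter(q.get(name, "").lower().split())
        c_bag = Counter(c.get(name, "").lower().split())
        features[name] = cosine(q_bag, c_bag)
    return features


def build_charge_similarity(case_charges):
    """Pre-compute charge similarity from co-occurrence in a corpus.

    Assumption: similarity is the Jaccard overlap of the sets of cases
    in which each charge appears.
    """
    count, co_count = Counter(), Counter()
    for charges in case_charges:
        unique = sorted(set(charges))
        count.update(unique)
        co_count.update(combinations(unique, 2))

    def similarity(a, b):
        if a == b:
            return 1.0
        both = co_count[tuple(sorted((a, b)))]
        union = count[a] + count[b] - both
        return both / union if union else 0.0

    return similarity


def combine(features, weights):
    """Fixed-weight linear scorer, a stand-in for a trained learning-to-rank model."""
    return sum(weights[name] * value for name, value in features.items())
```

A usage example under the same assumptions: build the charge similarity once from the corpus, add it to the segment features as one more entry, and rank candidates by the combined score.

```python
corpus_charges = [["theft", "robbery"], ["theft", "robbery"], ["theft"], ["fraud"]]
charge_sim = build_charge_similarity(corpus_charges)

query = "FACTS: A stole a bike. REASONING: Theft is proven. DECISION: Guilty."
candidate = "FACTS: B stole a car. REASONING: Robbery is proven. DECISION: Guilty."

features = segment_features(query, candidate)
features["charge"] = charge_sim("theft", "robbery")
weights = {"facts": 0.4, "reasoning": 0.2, "decision": 0.1, "charge": 0.3}  # illustrative values
score = combine(features, weights)
```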
