Abstract: Answering questions related to the legal domain is a complex task, primarily due to the intricate nature and diverse range of legal document systems. Providing an accurate answer to a legal query typically requires specialized knowledge in the relevant domain, which makes the task challenging even for human experts. Question answering (QA) systems are designed to generate answers to questions asked in human languages. QA uses natural language processing to understand questions and search through information to find relevant answers. QA has various practical applications, including customer service, education, research, and cross-lingual communication. However, QA faces challenges such as improving natural language understanding and handling complex and ambiguous questions. At this time, there is a lack of surveys that discuss legal question answering. To address this problem, we provide a comprehensive survey that reviews 14 benchmark datasets for question answering in the legal field and the state-of-the-art deep learning models for legal question answering. We cover the different architectures and techniques used in these studies, along with the performance and limitations of these models. Moreover, we have established a public GitHub repository where we regularly upload the most recent articles, open data, and source code. The repository is available at: \url{https://github.com/abdoelsayed2016/Legal-Question-Answering-Review}.

LEGAL-UQA: A Low-Resource Urdu-English Dataset for Legal Question Answering

UQA: Corpus for Urdu Question Answering

UQuAD1.0: Development of an Urdu Question Answering Training Data for Machine Reading Comprehension

LeDQA: A Chinese Legal Case Document-based Question Answering Dataset

Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models

Exploring the State of the Art in Legal QA Systems

FALQU: Finding Answers to Legal Questions

Experimenting with Legal AI Solutions: The Case of Question-Answering for Access to Justice

ArabLegalEval: A Multitask Benchmark for Assessing Arabic Legal Knowledge in Large Language Models

Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering

A Benchmark Dataset with Larger Context for Non-Factoid Question Answering over Islamic Text

Can LLM Generate Culturally Relevant Commonsense QA Data? Case Study in Indonesian and Sundanese

NativQA: Multilingual Culturally-Aligned Natural Query for LLMs

OMoS-QA: A Dataset for Cross-Lingual Extractive Question Answering in a German Migration Context

SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages

RJUA-QA: A Comprehensive QA Dataset for Urology

JEC-QA: A Legal-Domain Question Answering Dataset

Evaluating LLMs on Document-Based QA: Exact Answer Selection and Numerical Extraction using Cogtale dataset

ArabicaQA: A Comprehensive Dataset for Arabic Question Answering

FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages

AraLegal-BERT: A pretrained language model for Arabic Legal text