Building a relevance feedback corpus for legal information retrieval in the real-case scenario of the Brazilian Chamber of Deputies

Douglas Vitório, Ellen Souza, Lucas Martins, Nádia F. F. da Silva, André Carlos Ponce de Leon de Carvalho, Adriano L. I. Oliveira, Francisco Edmundo de Andrade
DOI: https://doi.org/10.1007/s10579-024-09767-3
2024-08-21
Language Resources and Evaluation
Abstract: The proper functioning of judicial and legislative institutions requires the efficient retrieval of legal documents from extensive datasets. Legal Information Retrieval focuses on investigating how to efficiently handle these datasets, enabling the retrieval of pertinent information from them. Relevance Feedback, an important aspect of Information Retrieval systems, utilizes the relevance information provided by the user to enhance document retrieval for a specific request. However, there is a lack of available corpora containing this information, particularly for the legislative scenario. Thus, this paper presents Ulysses-RFCorpus, a Relevance Feedback corpus for legislative information retrieval, built in the real-case scenario of the Brazilian Chamber of Deputies. To the best of our knowledge, this corpus is the first publicly available of its kind for the Brazilian Portuguese language. It is also the only corpus that contains feedback information for legislative documents, as the other corpora found in the literature primarily focus on judicial texts. We also used the corpus to evaluate the performance of the Brazilian Chamber of Deputies' Information Retrieval system. Thereby, we highlighted the model's strong performance and emphasized the dataset's significance in the field of Legal Information Retrieval.
computer science, interdisciplinary applications