GLARE: Guided LexRank for Advanced Retrieval in Legal Analysis

Fabio Gregório, Rafaela Castro, Kele Belloze, Rui Pedro Lopes, Eduardo Bezerra
2024-09-10
Abstract: The Brazilian Constitution, known as the Citizen's Charter, provides mechanisms for citizens to petition the Judiciary, including the so-called special appeal. This specific type of appeal aims to standardize the legal interpretation of Brazilian legislation in cases where the decision contradicts federal laws. The handling of special appeals is a daily task in the Judiciary, regularly presenting significant demands in its courts. We propose a new method called GLARE, based on unsupervised machine learning, to help the legal analyst classify a special appeal on a topic from a list made available by the National Court of Brazil (STJ). As part of this method, we propose a modification of the graph-based LexRank algorithm, which we call Guided LexRank. This algorithm generates the summary of a special appeal. The degree of similarity between the generated summary and different topics is evaluated using the BM25 algorithm. As a result, the method presents a ranking of themes most appropriate to the analyzed special appeal. The proposed method does not require prior labeling of the text to be evaluated and eliminates the need for large volumes of data to train a model. We evaluate the effectiveness of the method by applying it to a special appeal corpus previously classified by human experts.
Information Retrieval, Machine Learning
What problem does this paper attempt to address?
The paper addresses the problem of efficiently classifying special appeals under the appropriate repetitive themes within the Brazilian judicial system. Specifically, it proposes an unsupervised machine learning method called GLARE (Guided LexRank for Advanced Retrieval in Legal Analysis) to help legal analysts match special appeals against the list of themes provided by the Brazilian Superior Court of Justice (STJ).

### Background and Challenges

1. **Legal Background**:
   - The Brazilian Constitution allows citizens to challenge judicial decisions through special appeals, which aim to standardize the interpretation of Brazilian legislation.
   - Special appeals typically concern decisions from federal regional courts, state courts, or the Court of the Federal District and Territories that may violate federal laws or treaties, or that interpret federal laws differently from other courts.
2. **Practical Needs**:
   - A large number of special appeals are submitted to the STJ each year, creating a significant workload for the judicial system.
   - Classifying a special appeal is a complex task: the analyst must read the appeal in full and compare it with the STJ's repetitive themes to find the ones that match.
3. **Technical Challenges**:
   - A special appeal usually runs to around 4,000 words, while a repetitive theme is only about 40 words long, so matching the two demands good summarization skills from legal analysts.
   - The scarcity of training data limits the use of supervised learning methods, which motivates an unsupervised approach.

### Solution

The paper proposes the GLARE method, which consists of the following main steps:

1. **Text Preprocessing**:
   - Extract the core paragraphs from special appeals and repetitive themes, removing irrelevant words, punctuation, and numerical patterns.
2. **Summary Generation**:
   - Use a modified LexRank algorithm (Guided LexRank) to generate summaries of special appeals. The algorithm guides summary generation with an external signal, namely the similarity between each sentence and the repetitive themes, to improve the quality of the summaries.
3. **Similarity Evaluation**:
   - Use the BM25 algorithm to score the similarity between the generated summary and each repetitive theme, producing a list of themes ranked by similarity.

### Method Advantages

- **Unsupervised Learning**: Does not require pre-labeled training data, which suits situations where data is scarce.
- **Efficiency**: Can process a large number of special appeals in a short time, reducing processing time for the judicial system.
- **Accuracy**: Experimental results show that GLARE classifies special appeals much more accurately than existing baseline methods.

### Experimental Results

- In the experiments, GLARE correctly suggested the themes for about 76% of special appeals, significantly outperforming existing supervised learning methods and the TRF2 solution.
- Even for themes with only a small amount of data, GLARE still performed better than the supervised learning models.

### Main Contributions

1. **Datasets**: Constructed and publicly released two datasets, one consisting of special appeals and the other of repetitive themes.
2. **Method**: Proposed an unsupervised method that effectively classifies special appeals into the appropriate repetitive themes.

Through these contributions, GLARE offers a new way to improve the efficiency and accuracy of the Brazilian judicial system.
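
The summary-generation and similarity-evaluation steps described under Solution can be illustrated with a small, self-contained sketch. It is not the authors' implementation. The guidance is modeled here as a teleport vector in a PageRank-style iteration over the sentence graph, which is one plausible reading of "guided" and an assumption on our part. The BM25 scorer uses the common Lucene-style IDF with default-like values `k1=1.5` and `b=0.75`, also assumptions. All function names and the toy English data are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs only (drops punctuation and numbers).
    return re.findall(r"[^\W\d_]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def guided_lexrank(sentences, themes, damping=0.85, iters=50):
    # Assumed formulation: LexRank-style centrality on the sentence graph,
    # with the teleport distribution biased toward theme-like sentences.
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(tokenize(s)) for s in sentences]
    theme_bag = Counter(tokenize(" ".join(themes)))
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    row_sum = [sum(row) for row in sim]
    guide = [cosine(bag, theme_bag) for bag in bags]
    total = sum(guide)
    guide = [g / total for g in guide] if total else [1.0 / n] * n
    scores = [1.0 / n] * n
    for _ in range(iters):
        new_scores = []
        for i in range(n):
            walk = 0.0
            for j in range(n):
                if row_sum[j]:
                    walk += scores[j] * sim[j][i] / row_sum[j]
                else:
                    # Isolated sentence: redistribute its mass via the guide.
                    walk += scores[j] * guide[i]
            new_scores.append((1 - damping) * guide[i] + damping * walk)
        scores = new_scores
    return scores


def summarize(sentences, themes, k=2):
    # Keep the k highest-scoring sentences, in their original order.
    scores = guided_lexrank(sentences, themes)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def bm25_rank(query, docs, k1=1.5, b=0.75):
    # Rank docs (themes) against the query (summary); returns doc indices.
    doc_tokens = [tokenize(d) for d in docs]
    n = len(docs)
    avgdl = (sum(len(d) for d in doc_tokens) / n) or 1.0
    df = Counter(t for d in doc_tokens for t in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in doc_tokens:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if t in tf:
                idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
                norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
                score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


# Toy data standing in for repetitive themes and the sentences of one appeal.
themes = [
    "incidence of income tax on retirement pension benefits",
    "limitation period for consumer claims against telephone companies",
]
sentences = [
    "The appellant challenges the income tax charged on his retirement pension.",
    "The lower court held that the tax applies to pension benefits.",
    "The hearing was held on a Tuesday.",
]

summary = summarize(sentences, themes, k=2)
ranking = bm25_rank(" ".join(summary), themes)
```

Here `ranking` holds theme indices ordered from most to least similar to the guided summary, which mirrors the ranked list of themes the method presents to the analyst. A real pipeline would add the preprocessing step (stop-word removal and core-paragraph extraction) and would work on Portuguese text.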