Enhancing Legal Compliance and Regulation Analysis with Large Language Models

Shabnam Hassani
2024-04-27
Abstract: This research explores the application of Large Language Models (LLMs) for automating the extraction of requirement-related legal content in the food safety domain and checking legal compliance of regulatory artifacts. With Industry 4.0 revolutionizing the food industry and with the General Data Protection Regulation (GDPR) reshaping privacy policies and data processing agreements, there is a growing gap between regulatory analysis and recent technological advancements. This study aims to bridge this gap by leveraging LLMs, namely BERT and GPT models, to accurately classify legal provisions and automate compliance checks. Our findings demonstrate promising results, indicating LLMs' significant potential to enhance legal compliance and regulatory analysis efficiency, notably by reducing manual workload and improving accuracy within reasonable time and financial constraints.
Software Engineering, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the growing gap between regulatory analysis and recent technological advancements, a gap widened by Industry 4.0 in the food industry and by the impact of the General Data Protection Regulation (GDPR) on privacy policies and data processing agreements. Specifically, the study uses large language models (LLMs), namely BERT and GPT models, to automate the extraction of requirement-related legal content in the food safety domain and to check the legal compliance of regulatory documents. The goal is to develop an LLM-based approach for classifying food safety legal requirements and to improve the effectiveness and interoperability of compliance checks on regulatory documents, with the aim of outperforming traditional methods in efficiency, accuracy, and cost. The main research questions are:
1. How accurate is the method for classifying legal requirements?
2. How does the method for classifying legal requirements perform relative to baseline methods?
3. How accurate are state-of-the-art open-source and closed-source LLMs at regulatory compliance checking in zero-shot and fine-tuned settings?
4. How much does text normalization that combines paragraph context and compliance rules improve compliance checking compared with traditional sentence-level analysis?
5. What is the practical impact of the proposed method in terms of time and cost?
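The first two research questions concern how accurately legal requirements are classified and how that compares with a baseline. The summary does not say which metrics the paper reports, so the following is only a minimal sketch assuming the common choice of precision, recall, and F1 for a single positive class. The label names, the toy data, and the function `precision_recall_f1` are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(gold, predicted, positive="requirement"):
    """Return (precision, recall, F1) for one positive class.

    `gold` and `predicted` are equal-length sequences of labels.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels, invented purely for illustration.
gold = ["requirement", "other", "requirement", "requirement", "other"]
model_pred = ["requirement", "other", "requirement", "other", "other"]
baseline_pred = ["requirement", "requirement", "other", "other", "other"]

model_scores = precision_recall_f1(gold, model_pred)
baseline_scores = precision_recall_f1(gold, baseline_pred)

for name, (p, r, f) in [("model", model_scores), ("baseline", baseline_scores)]:
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Scoring the candidate classifier and the baseline against the same gold labels keeps the comparison in question 2 like for like. Any real evaluation would use the paper's own annotated provisions rather than toy lists.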