Towards Understanding Contracts Grammar: A Large Language Model-Based Extractive Question-Answering Approach

Gokul Rejithkumar, P. Anish, S. Ghaisas
DOI: https://doi.org/10.1109/RE59067.2024.00037
2024-06-24
Abstract: Software Engineering (SE) contracts play a pivotal role in Information Technology Outsourcing (ITO) projects. The obligations in SE contracts are known to be a useful source for deriving software requirements, thereby contributing to the overall Software Development Life Cycle (SDLC). Making sense of contractual obligations is an important first step in successfully executing software projects. This includes building compliant systems, meeting delivery deadlines, avoiding heavy penalties, and steering clear of expensive litigation. In this work, we present an approach to capture the essence of a contractual clause by extracting its Contracts Grammar. Through an exploratory study, we first identify the constituents of Contracts Grammar. Subsequently, we experiment with multiple approaches for the automated extraction of these constituents, including extractive question-answering, token classification, text-to-text generation, prompting, and regular expressions. The question-answering-based approach performed best, achieving a high average ROUGE-L score of 0.81 along with faster inference times. The work presented in this paper is a part of the Contracts Governance System (CGS) and is in the process of deployment within a large IT vendor organization.
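For readers unfamiliar with the metric reported above, ROUGE-L scores an extracted span against a reference span using their longest common subsequence (LCS). The following is a minimal sketch of the F1 form of ROUGE-L, assuming lowercased whitespace tokenization and equal weighting of precision and recall. The abstract does not state which ROUGE-L variant or tokenizer the authors used, so the function names and the sample clause text here are illustrative assumptions, not the paper's evaluation code.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)  # DP row for the previous element of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 between a reference string and a candidate string.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical reference and extracted answer, for illustration only.
    gold = "the vendor shall deliver the software"
    extracted = "the vendor shall deliver"
    print(round(rouge_l_f1(gold, extracted), 2))
```

An exact match scores 1.0 and a span sharing no tokens with the reference scores 0.0, so an average of 0.81 indicates that extracted spans overlap heavily with the annotated ones without always matching them exactly.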
Computer Science