A Legal Framework for Natural Language Processing Model Training in Portugal

Rúben Almeida, Evelin Amorim
2024-05-01
Abstract: Recent advances in deep learning have promoted the advent of many computational systems capable of performing intelligent actions that, until recently, were restricted to the human intellect. In the particular case of human languages, these advances allowed the introduction of applications like ChatGPT that are capable of generating coherent text without being explicitly programmed to do so. Instead, these models use large volumes of textual data to learn meaningful representations of human languages. Associated with these advances, concerns about copyright and data privacy infringements caused by these applications have emerged. Despite these concerns, the pace at which new natural language processing applications have been developed has largely outstripped the introduction of new regulations. Today, communication barriers between legal experts and computer scientists lead to many unintentional legal infringements during the development of such applications. In this paper, a multidisciplinary team intends to bridge this communication gap and promote more compliant Portuguese NLP research by presenting a series of everyday NLP use cases, while highlighting the Portuguese legislation that may apply during their development.
Computation and Language, Emerging Technologies
What problem does this paper attempt to address?
The paper addresses the legal issues that arise when training natural language processing (NLP) models under Portuguese law. As deep learning has driven the development of NLP technology, concerns about copyright and data privacy have emerged. Regulation has not kept pace with the advancement of NLP applications, resulting in a legal vacuum. The paper aims to improve communication between computer scientists and legal experts by outlining the legal issues likely to be encountered in everyday NLP use cases, in order to enhance compliance in Portuguese NLP research. It covers the relevant legal background, the current state of NLP in Portugal, an overview of the legal system, licensing agreements, and case studies, with the goal of providing a legal reference for Portugal and other European Union countries.