An Economic Solution to Copyright Challenges of Generative AI

Jiachen T. Wang, Zhun Deng, Hiroaki Chiba-Okabe, Boaz Barak, Weijie J. Su
2024-04-25
Abstract: Generative artificial intelligence (AI) systems are trained on large data corpora to generate new pieces of text, images, videos, and other media. There is growing concern that such systems may infringe on the copyright interests of training data contributors. To address the copyright challenges of generative AI, we propose a framework that compensates copyright owners proportionally to their contributions to the creation of AI-generated content. The metric for contributions is quantitatively determined by leveraging the probabilistic nature of modern generative AI models and using techniques from cooperative game theory in economics. This framework enables a platform where AI developers benefit from access to high-quality training data, thus improving model performance. Meanwhile, copyright owners receive fair compensation, driving the continued provision of relevant data for generative model training. Experiments demonstrate that our framework successfully identifies the most relevant data sources used in artwork generation, ensuring a fair and interpretable distribution of revenues among copyright owners.
Machine Learning, General Economics, Methodology
What problem does this paper attempt to address?
This paper addresses the concern that generative artificial intelligence (AI) systems may infringe the copyright interests of the people whose work they are trained on. Generative AI systems require large amounts of training data, including text, images, videos, and other media. This data often contains copyrighted material, so using it for training and generation may infringe the rights of copyright holders.
To address this, the paper proposes a framework that compensates copyright holders in proportion to their contributions to AI-generated content. Contributions are quantified by combining the probabilistic nature of modern generative AI models with techniques from cooperative game theory in economics. AI developers gain access to high-quality training data and therefore better model performance, while copyright holders receive fair compensation, which encourages them to keep providing relevant data.
The main contribution of the paper is a solution that neither modifies the inference process nor compromises the full functionality of the generative model. Instead, a revenue distribution mechanism enables mutually beneficial cooperation between AI developers and copyright holders. Experimental results show that the framework successfully identifies the most relevant data sources used in artwork generation, so that the distribution of revenue among copyright holders is both fair and interpretable.
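The idea of paying copyright holders in proportion to a game-theoretic contribution score can be illustrated with a small sketch. It assumes the Shapley value as the cooperative-game technique (the text above only says "techniques from cooperative game theory") and treats each data source as a player. The utility table is entirely hypothetical: in a real system it would presumably reflect how likely a model trained on a given subset of sources is to produce the generated content, which this toy example does not attempt to compute. The names `shapley_values`, `royalty_shares`, and `TOY_UTILITY` are illustrative and not taken from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley value of every player for a coalition utility function.

    `utility` maps a frozenset of players to a number. The cost is
    exponential in len(players), so this only suits a handful of sources.
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # k = size of the coalition that p joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Weighted marginal contribution of p to coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


def royalty_shares(values):
    """Normalize contribution scores so that they sum to 1."""
    total = sum(values.values())
    if total == 0:
        raise ValueError("total contribution is zero; shares are undefined")
    return {p: v / total for p, v in values.items()}


# Hypothetical utilities for three data sources A, B, C (assumption:
# higher means the coalition's data explains the generated content better).
TOY_UTILITY = {
    frozenset(): 0.0,
    frozenset({"A"}): 4.0,
    frozenset({"B"}): 2.0,
    frozenset({"C"}): 0.0,
    frozenset({"A", "B"}): 5.0,
    frozenset({"A", "C"}): 4.0,
    frozenset({"B", "C"}): 2.0,
    frozenset({"A", "B", "C"}): 5.0,
}

values = shapley_values(["A", "B", "C"], TOY_UTILITY.__getitem__)
shares = royalty_shares(values)
```

With this toy table, source A receives a score of 3.5, B receives 1.5, and C, which never changes any coalition's utility, receives 0, so revenue would be split 70% / 30% / 0%. Simple normalization assumes non-negative scores; how negative contributions should be handled is left open here.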