A Survey of Data Pricing for Data Marketplaces

Mengxiao Zhang, Fernando Beltran, Jiamou Liu
2023-03-07
Abstract: A data marketplace is an online venue that brings data owners, data brokers, and data consumers together and facilitates commoditisation of data amongst them. Data pricing, as a key function of a data marketplace, demands quantifying the monetary value of data. A considerable number of studies on data pricing can be found in the literature. This paper attempts to comprehensively review the state of the art in data pricing studies to provide a general understanding of this emerging research area. Our key contribution lies in a new taxonomy of data pricing studies that unifies different attributes determining data prices. The basis of our framework categorises these studies by the kind of market structure, be it sell-side, buy-side, or two-sided. Then in a sell-side market, the studies are further divided by query type, which defines the way a data consumer accesses data, while in a buy-side market, the studies are divided according to privacy notion, which defines the way to quantify privacy of data owners. In a two-sided market, both privacy notion and query type are used as criteria. We systematically examine the studies falling into each category in our taxonomy. Lastly, we discuss gaps within the existing research and define future research directions.
Computer Science and Game Theory, Artificial Intelligence, Databases
What problem does this paper attempt to address?
The paper addresses data pricing in data marketplaces. It reviews and synthesizes existing research on data pricing to give a general understanding of this emerging field. Its core contribution is a new taxonomy of data pricing studies that unifies the different attributes determining data prices. The taxonomy rests on three criteria:

1. **Market Structure**: Data markets are divided into three types: sell-side, buy-side, and two-sided. The way data is valued differs under each structure.
   - Sell-side market: integrates data from multiple sources and sells it to data consumers.
   - Buy-side market: allows individuals and organizations to profit by selling their internal data.
   - Two-sided market: combines the two, acquiring data from data owners and selling data directly to data consumers.
2. **Query Type**: In sell-side markets, studies are further classified by the way data consumers access data (the query type). The main types are direct access to datasets (null query), one-time queries for specific statistics (such as an average or a sum), and general queries (which may be issued repeatedly and return results with multiple rows and columns).
3. **Privacy Notion**: In buy-side markets, studies are classified by how the privacy loss of data owners is quantified. This covers notions with a strictly defined privacy loss, such as differential privacy, as well as other privacy definitions tailored to specific application scenarios or data types.

In two-sided markets, both query type and privacy notion are used as criteria.

The paper also systematically reviews the studies in each category, discusses gaps in existing research, and proposes future research directions. These include how to leverage machine learning and reinforcement learning to solve complex data pricing problems, and how to design new pricing schemes for unstructured data (such as natural language). Additionally, the paper reviews research on the value of privacy in the fields of economics and management.
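
As an illustration only, the two-level taxonomy described in the abstract (market structure first, then query type and/or privacy notion) can be sketched as a small lookup structure. The names `MarketStructure`, `Criterion`, `SECONDARY_CRITERIA`, and `taxonomy_path` are hypothetical and do not come from the paper; the mapping itself follows the abstract.

```python
from enum import Enum


class MarketStructure(Enum):
    """First-level split of the taxonomy (terms taken from the abstract)."""
    SELL_SIDE = "sell-side"
    BUY_SIDE = "buy-side"
    TWO_SIDED = "two-sided"


class Criterion(Enum):
    """Second-level criteria named in the abstract."""
    QUERY_TYPE = "query type"
    PRIVACY_NOTION = "privacy notion"


# Which second-level criteria apply under each market structure.
# Hypothetical encoding; only the mapping is taken from the abstract.
SECONDARY_CRITERIA = {
    MarketStructure.SELL_SIDE: (Criterion.QUERY_TYPE,),
    MarketStructure.BUY_SIDE: (Criterion.PRIVACY_NOTION,),
    MarketStructure.TWO_SIDED: (Criterion.QUERY_TYPE, Criterion.PRIVACY_NOTION),
}


def taxonomy_path(market: MarketStructure) -> str:
    """Describe how studies in the given market structure are subdivided."""
    criteria = " + ".join(c.value for c in SECONDARY_CRITERIA[market])
    return f"{market.value} market -> divided by {criteria}"


if __name__ == "__main__":
    for market in MarketStructure:
        print(taxonomy_path(market))
```

Encoding the criteria as a mapping keeps the first-level split and the second-level criteria separate, which mirrors how the abstract introduces them.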