Abstract: Data markets facilitate decentralized data exchange for applications such as prediction, learning, or inference. The design of these markets is challenged by varying privacy preferences as well as data similarity among data owners. Related works have often overlooked how data similarity impacts pricing and data value through statistical information leakage. We demonstrate that data similarity and privacy preferences are integral to market design and propose a query-response protocol using local differential privacy for a two-party data acquisition mechanism. In our regression data market model, we analyze strategic interactions between privacy-aware owners and the learner as a Stackelberg game over the asked price and privacy factor. Finally, we numerically evaluate how data similarity affects market participation and traded data value.

What problem does this paper attempt to address?

This paper asks how to find the optimal balance between data privacy and surrogate utility when data are similar across owners in a regression data market. Specifically, the paper focuses on:

1. **Trade-off between Data Privacy and Utility**: In the regression data market, data owners (i.e., surrogates) have different privacy preferences, and their data may be similar. Such similarity can lead to statistical information leakage, which affects the value and pricing of data. Learners (i.e., data purchasers) therefore need to optimize their query signals to extract distributed features while accounting for the value and relevance of the features each surrogate provides.
2. **Impact of Privacy-Protection Mechanisms**: To protect privacy, surrogates can adopt techniques such as Local Differential Privacy (LDP). These mechanisms introduce noise, which reduces the accuracy of the model. Learners therefore need to design incentive strategies so that surrogates accept certain privacy requirements while still providing high-quality data.
3. **Strategic Interaction and Game-Theoretic Framework**: The paper models the interaction between learners and surrogates as a Stackelberg game in which learners are leaders and surrogates are followers. Within this game structure, it analyzes the strategic behavior of surrogates under different privacy budgets and prices, and demonstrates the existence and uniqueness of the Nash best-response strategy.

### Specific Problem Description

- **Impact of Data Similarity on Market Participation and Transaction Data Value**: When the data of multiple surrogates are similar, information may leak from one owner's data about another's, which in turn affects market prices and data values. For example, in the labor market, Alice and Bob may have similar data, but Bob has a higher privacy preference, so he may be unwilling to share his data directly and may instead demand higher compensation.
- **How to Design an Effective Incentive Mechanism**: Learners need to design an incentive mechanism so that surrogates are willing to participate and provide high-quality data while their privacy needs are met. This involves adjusting the price and privacy factor according to the privacy preferences of the surrogates.

### Solution

The paper proposes a query-response protocol based on local differential privacy and applies it to a two-party data acquisition mechanism. It also evaluates numerically how data similarity affects market participation and transaction data value. Finally, it shows how reasonable incentive design can achieve the optimal trade-off between data privacy and utility in the regression data market.

### Mathematical Formula Representation

- **Local Differential Privacy (LDP)**:
  \[ P[M(x)=y]\leq e^{\epsilon}P[M(x') = y],\quad\forall y\in\text{Dom}(M) \]
  where \(M\) is a randomized mechanism with discrete outputs, \(x\) and \(x'\) are any two inputs, and \(\text{Dom}(M)\) denotes the set of all possible outcomes.
- **Utility Function of the Central Surrogate**:
  \[ S(p;\epsilon)=L(\zeta)\left(\frac{1}{\ln[\alpha\epsilon p + 1]-\beta}-p\sum_{n\in A\setminus\epsilon_n(q_n)>\epsilon}\right) \]
  where \(L(\zeta)=\frac{1}{|\tilde{L}_{\omega_i}-\tilde{L}_\Omega|}\) represents the improvement in model prediction accuracy after using the features provided by surrogates.

Through these methods, the paper addresses privacy protection and utility maximization under data similarity, and provides a theoretical basis and solutions for practical applications.
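The LDP inequality above can be checked numerically on a small mechanism. The following is a minimal sketch, assuming binary randomized response as the mechanism \(M\). This is a standard ε-LDP mechanism chosen here for illustration; the summary does not say which mechanism the paper uses, and all function names are hypothetical.

```python
import math
import random


def truth_probability(epsilon: float) -> float:
    """Probability of reporting the true bit: e^eps / (e^eps + 1)."""
    return math.exp(epsilon) / (math.exp(epsilon) + 1.0)


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Assumed mechanism M: keep the bit with truth_probability, else flip it."""
    return bit if rng.random() < truth_probability(epsilon) else 1 - bit


def output_probability(x: int, y: int, epsilon: float) -> float:
    """Exact P[M(x) = y] for binary randomized response."""
    p = truth_probability(epsilon)
    return p if x == y else 1.0 - p


def satisfies_ldp(epsilon: float, tol: float = 1e-12) -> bool:
    """Check P[M(x)=y] <= e^eps * P[M(x')=y] for all inputs x, x' and outputs y."""
    bound = math.exp(epsilon)
    return all(
        output_probability(x, y, epsilon)
        <= bound * output_probability(x_alt, y, epsilon) + tol
        for x in (0, 1)
        for x_alt in (0, 1)
        for y in (0, 1)
    )


def debiased_mean(reports: list, epsilon: float) -> float:
    """Unbiased estimate of the true mean of the bits from the noisy reports.

    E[report] = (1 - p) + mean * (2p - 1), so we invert that relation.
    """
    p = truth_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    epsilon = 1.0
    true_bits = [1 if rng.random() < 0.3 else 0 for _ in range(20000)]
    reports = [randomized_response(b, epsilon, rng) for b in true_bits]
    print("LDP bound holds:", satisfies_ldp(epsilon))
    print("true mean:", sum(true_bits) / len(true_bits))
    print("estimate from private reports:", debiased_mean(reports, epsilon))
```

In this sketch a smaller `epsilon` pushes `truth_probability` toward 0.5, which makes the denominator `2p - 1` in `debiased_mean` smaller and the estimate noisier. This is the accuracy cost of privacy protection that the learner must compensate for through pricing.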

Privacy-Aware Data Acquisition under Data Similarity in Regression Markets

Privacy Risks of Social Interaction Structure: Network Learning in Quadratic Games

Locally Differentially Private Personal Data Markets Using Contextual Dynamic Pricing Mechanism

Wasserstein Markets for Differentially-Private Data

Equilibria of Data Marketplaces with Privacy-Aware Sellers under Endogenous Privacy Costs

On the Differential Private Data Market: Endogenous Evolution, Dynamic Pricing, and Incentive Compatibility

An Incentive Mechanism for Trading Personal Data in Data Markets

Selling Data at an Auction under Privacy Constraints

Two-Party Privacy Games: How Users Perturb When Learners Preempt

Bayesian Regression Markets

Integrated Private Data Trading Systems for Data Marketplaces

A Profit-Maximizing Data Marketplace with Differentially Private Federated Learning under Price Competition

Striking a Balance: An Optimal Mechanism Design for Heterogenous Differentially Private Data Acquisition for Logistic Regression

Online Pricing and Trading of Private Data in Correlated Queries

A Socially Optimal Data Marketplace With Differentially Private Federated Learning

Game Theory Based Correlated Privacy Preserving Analysis in Big Data

Data Sharing Markets

How to Balance Privacy and Money through Pricing Mechanism in Personal Data Market

Pricing and disseminating customer data with privacy awareness

Approximately Optimal Auctions for Selling Privacy when Costs are Correlated with Data

Optimal Trading Mechanism Based on Differential Privacy Protection and Stackelberg Game in Big Data Market