Towards Replication-Robust Data Markets

Thomas Falconer, Jalal Kazempour, Pierre Pinson
2024-10-01
Abstract: Despite the widespread adoption of machine learning throughout industry, many firms face a common challenge: relevant datasets are typically distributed amongst market competitors that are reluctant to share information. Recent works propose data markets to provide monetary incentives for collaborative machine learning, where agents share features with each other and are rewarded based on their contribution to improving the predictions of others. These contributions are determined by their relative Shapley value, which is computed by treating features as players and their interactions as a characteristic function game. However, in its standard form, this setup also gives agents an incentive to replicate their data and act under multiple false identities in order to increase their own revenue and diminish that of others, which restricts the use of such markets in practice. In this work, we develop a replication-robust data market for supervised learning problems. We adopt Pearl's do-calculus from causal reasoning to refine the characteristic function game by differentiating between observational and interventional conditional probabilities. By doing this, we derive Shapley value-based rewards that are robust to this malicious replication by design, whilst preserving desirable market properties.
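The replication incentive described in the abstract can be illustrated with a small sketch. It treats features as players in a characteristic function game and computes exact Shapley values by enumerating orderings. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature names `x`, `x_copy` and `y`, the helper names `shapley_values` and `toy_value`, and the coalition values 0.6 and 1.0, which stand in for the predictive gain of two correlated features. The sketch only reproduces the problem in the standard setting; it does not implement the interventional refinement the authors propose.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            totals[p] += value(frozenset(coalition)) - before
    return {p: t / len(orderings) for p, t in totals.items()}


# Hypothetical mapping from each submitted feature to the signal it carries.
# "x_copy" is a replica of "x" submitted under a false identity.
SIGNAL = {"x": "x", "x_copy": "x", "y": "y"}


def toy_value(coalition):
    """Hypothetical characteristic function: value depends only on the
    distinct signals present, so a replica adds nothing to a coalition
    that already contains the original."""
    distinct = {SIGNAL[f] for f in coalition}
    if len(distinct) == 2:
        return 1.0  # both signals available (assumed value)
    if len(distinct) == 1:
        return 0.6  # one signal available (assumed value)
    return 0.0      # empty coalition


# Honest market: agent A owns x, agent B owns y.
honest = shapley_values(["x", "y"], toy_value)

# Agent A replicates x and also submits x_copy.
replicated = shapley_values(["x", "x_copy", "y"], toy_value)

agent_a_honest = honest["x"]
agent_a_replicated = replicated["x"] + replicated["x_copy"]
agent_b_honest = honest["y"]
agent_b_replicated = replicated["y"]

print(f"A: {agent_a_honest:.4f} -> {agent_a_replicated:.4f}")
print(f"B: {agent_b_honest:.4f} -> {agent_b_replicated:.4f}")
```

Under these assumed values, agent A's combined share rises after replication and agent B's falls, even though the total value of the full coalition is unchanged. Enumerating all orderings is only feasible for a handful of features; it is used here for transparency rather than efficiency.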
General Economics, Computer Science and Game Theory