Abstract: We investigate brokerage between traders from an online learning perspective. At any round $t$, two traders arrive with their private valuations, and the broker proposes a trading price. Unlike other bilateral trade problems already studied in the online learning literature, we focus on the case where there are no designated buyer and seller roles: each trader will attempt to either buy or sell depending on the current price of the good. We assume the agents' valuations are drawn i.i.d. from a fixed but unknown distribution. If the distribution admits a density bounded by some constant $M$, then, for any time horizon $T$:

- If the agents' valuations are revealed after each interaction, we provide an algorithm achieving regret $M \log T$ and show this rate is optimal, up to constant factors.
- If only their willingness to sell or buy at the proposed price is revealed after each interaction, we provide an algorithm achieving regret $\sqrt{M T}$ and show this rate is optimal, up to constant factors.

Finally, if we drop the bounded density assumption, we show that the optimal rate degrades to $\sqrt{T}$ in the first case, and the problem becomes unlearnable in the second.
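The interaction in a single round can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the names `RoundOutcome` and `play_round` are hypothetical, and two details are assumptions that the abstract does not spell out, namely that a trade happens whenever the proposed price lies between the two valuations, and that the reward of a round is the gain from trade (the difference between the two valuations).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoundOutcome:
    trade: bool             # whether a trade took place
    buyer: Optional[str]    # "first" or "second" trader, None if no trade
    seller: Optional[str]   # "first" or "second" trader, None if no trade
    gain: float             # assumed reward: gain from trade, 0.0 if no trade


def play_round(v: float, w: float, price: float) -> RoundOutcome:
    """Resolve one round for valuations v, w and the broker's price.

    Assumption: the trader with the higher valuation buys and the other
    sells, provided the price lies between the two valuations.
    """
    low, high = min(v, w), max(v, w)
    if low < high and low <= price <= high:
        buyer = "first" if v > w else "second"
        seller = "second" if v > w else "first"
        return RoundOutcome(True, buyer, seller, high - low)
    # Both traders are on the same side of the price: nobody to trade with.
    return RoundOutcome(False, None, None, 0.0)
```

For instance, `play_round(0.2, 0.9, 0.5)` makes the second trader the buyer and the first the seller, while `play_round(0.2, 0.4, 0.5)` yields no trade because both traders would want to sell at that price.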

What problem does this paper attempt to address?

The paper studies brokerage from an online learning perspective, in bilateral markets with no designated buyer and seller roles. Its objective is to design algorithms that minimize the broker's regret and to prove that the resulting rates cannot be improved. In each round, two traders arrive with private valuations and the broker proposes a trading price. Unlike in traditional bilateral trade problems, each trader may act as either a buyer or a seller, depending on the proposed price of the good. The traders' valuations are assumed to be independent and identically distributed (i.i.d.) draws from a fixed but unknown distribution. The paper considers two feedback models:

1. **Full Feedback**: After each round, the broker observes the actual valuations of the two traders.
   - If the distribution admits a density bounded by $M$, the paper provides an algorithm with regret $O(M\log{T})$ and proves a matching lower bound of $\Omega(M\log{T})$, so this rate is optimal up to constant factors.
   - If the bounded density assumption is dropped, the optimal rate degrades to $\Theta(\sqrt{T})$.
2. **Two-Bit Feedback**: After each round, the broker observes only whether each trader is willing to sell or to buy at the proposed price.
   - If the distribution admits a density bounded by $M$, the paper provides an algorithm with regret $O(\sqrt{MT})$ and proves a matching lower bound of $\Omega(\sqrt{MT})$, so this rate is also optimal up to constant factors.
   - If the bounded density assumption is dropped, the problem becomes unlearnable.

In summary, the main contributions of the paper are:

- Algorithms that attain the optimal regret rate under each feedback model.
- Matching lower bounds showing that these rates cannot be improved.
- Notably, a logarithmic regret bound under the bounded density assumption, a result not previously attained in the literature.
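The difference between the two feedback models can be made concrete with a small simulation. The sketch below is illustrative only and is not the paper's algorithm: all function names are hypothetical, the reward is assumed to be the gain from trade (the gap between the two valuations when the price lies between them), the two bits are assumed to encode whether each valuation is at most the price, and the full-feedback broker simply posts the running mean of the valuations seen so far as one natural baseline.

```python
import random


def gain_from_trade(v, w, price):
    """Assumed reward: the valuation gap if the price separates the traders."""
    low, high = min(v, w), max(v, w)
    return high - low if low <= price <= high else 0.0


def full_feedback(v, w, price):
    """Full feedback: the broker sees both valuations after the round."""
    return (v, w)


def two_bit_feedback(v, w, price):
    """Two-bit feedback: one bit per trader (assumed encoding:
    True means the valuation is at most the price, i.e. willing to sell)."""
    return (v <= price, w <= price)


def simulate_running_mean(horizon, sample, seed=0, first_price=0.5):
    """Baseline full-feedback broker that posts the running mean valuation.

    `sample` is a hypothetical callable drawing one valuation from the
    unknown distribution, given a random.Random instance.
    """
    rng = random.Random(seed)
    total, count = 0.0, 0
    price = first_price
    gains = []
    for _ in range(horizon):
        v, w = sample(rng), sample(rng)
        gains.append(gain_from_trade(v, w, price))
        seen_v, seen_w = full_feedback(v, w, price)
        total += seen_v + seen_w
        count += 2
        price = total / count  # next price: mean of all valuations seen so far
    return gains, price


if __name__ == "__main__":
    gains, final_price = simulate_running_mean(2000, lambda rng: rng.random())
    print(f"final price: {final_price:.3f}, total gain: {sum(gains):.1f}")
```

Under two-bit feedback the broker could not run this baseline, because `two_bit_feedback` never reveals the valuations themselves. It only tells the broker on which side of the price each trader falls, which is why the achievable rate is slower in that model.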

An Online Learning Theory of Brokerage

A Contextual Online Learning Theory of Brokerage

Trading Volume Maximization with Online Learning

Fair Online Bilateral Trade

Feature-Based Online Bilateral Trade

Online Learning and Pricing for Multiple Products with Reference Price Effects

No-Regret Learning in Bilateral Trade via Global Budget Balance

Strategic Learning and Trading in Broker-Mediated Markets

Online Learning for Equilibrium Pricing in Markets under Incomplete Information

Online Learning in Betting Markets: Profit versus Prediction

Nash Equilibrium between Brokers and Traders

Online Optimization Algorithms in Repeated Price Competition: Equilibrium Learning and Algorithmic Collusion

Gains-from-Trade in Bilateral Trade with a Broker

New Perspectives in Online Contract Design

No-Regret Learning for Stackelberg Equilibrium Computation in Newsvendor Pricing Games

A simple learning agent interacting with an agent-based market model

Online Learning in Supply-Chain Games

Learning to Price Homogeneous Data

Selling Joint Ads: A Regret Minimization Perspective

Online Learning and Profit Maximization from Revealed Preferences

Brokers or Dealers? Trading Intermediation Across Markets and over Time