An Online Learning Theory of Brokerage

Nataša Bolić, Tommaso Cesari, Roberto Colomboni
2023-10-19
Abstract: We investigate brokerage between traders from an online learning perspective. At any round $t$, two traders arrive with their private valuations, and the broker proposes a trading price. Unlike other bilateral trade problems already studied in the online learning literature, we focus on the case where there are no designated buyer and seller roles: each trader will attempt to either buy or sell depending on the current price of the good. We assume the agents' valuations are drawn i.i.d. from a fixed but unknown distribution. If the distribution admits a density bounded by some constant $M$, then, for any time horizon $T$:
- If the agents' valuations are revealed after each interaction, we provide an algorithm achieving regret $M \log T$ and show this rate is optimal, up to constant factors.
- If only their willingness to sell or buy at the proposed price is revealed after each interaction, we provide an algorithm achieving regret $\sqrt{M T}$ and show this rate is optimal, up to constant factors.

Finally, if we drop the bounded density assumption, we show that the optimal rate degrades to $\sqrt{T}$ in the first case, and the problem becomes unlearnable in the second.
Machine Learning
What problem does this paper attempt to address?
The paper studies brokerage from the perspective of online learning, in bilateral markets where traders have no designated buyer or seller roles. Its core objective is to design algorithms that minimize regret and to prove matching guarantees for them. In each round, two traders arrive with private valuations, and the broker proposes a trading price. Unlike traditional bilateral trade problems, each trader here may act as either a buyer or a seller, depending on the current price of the good. The paper assumes that the traders' valuations are independent and identically distributed (i.i.d.) draws from a fixed but unknown distribution, and it focuses on two cases:

1. **Full Feedback**: After each round, the broker observes the actual valuations of the two traders.
   - If the distribution admits a density bounded by \(M\), the paper provides an algorithm that achieves a regret bound of \(O(M\log{T})\) and proves that this is optimal (i.e., it proves a matching lower bound of \(\Omega(M\log{T})\)).
   - If the bounded density assumption is dropped, the optimal regret rate degrades to \(\sqrt{T}\).
2. **Two-Bit Feedback**: After each round, the broker observes only whether each trader is willing to sell or buy at the proposed price.
   - If the distribution admits a density bounded by \(M\), the paper provides an algorithm that achieves a regret bound of \(O(\sqrt{MT})\) and proves that this is optimal (i.e., it proves a matching lower bound of \(\Omega(\sqrt{MT})\)).
   - If the bounded density assumption is dropped, the problem becomes unlearnable.

In summary, the main contributions of the paper include:
- Designing algorithms with optimal regret bounds under each feedback condition.
- Theoretically proving the performance bounds of these algorithms.
- Notably, the paper demonstrates that under the bounded density assumption, a logarithmic regret bound can be achieved, which is a result not previously attained in the literature.
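
To make the interaction protocol concrete, the following is a minimal Python sketch of one way to simulate it. Everything named here is an illustrative assumption rather than something taken from the paper: the function names are hypothetical, valuations are assumed uniform on [0, 1] (a density bounded by 1), the two-bit feedback is assumed to be encoded as one boolean per trader, and the "post the running empirical mean of observed valuations" strategy is only a simple full-feedback baseline used to exercise the loop, not necessarily the paper's algorithm.

```python
import random


def gain_from_trade(price, v, w):
    """Gain from trade for valuations v, w at the posted price.

    A trade happens only if the price lies between the two valuations;
    the lower-valuation trader sells to the higher-valuation one.
    """
    lo, hi = min(v, w), max(v, w)
    return hi - lo if lo <= price <= hi else 0.0


def full_feedback(price, v, w):
    """Full feedback: both valuations are revealed after the round."""
    return (v, w)


def two_bit_feedback(price, v, w):
    """Two-bit feedback (assumed encoding): for each trader, one bit
    saying whether their valuation is at most the posted price."""
    return (v <= price, w <= price)


def simulate_running_mean(horizon, initial_price=0.0, seed=0):
    """Illustrative full-feedback baseline on uniform[0, 1] valuations.

    Posts the running empirical mean of all valuations seen so far and
    compares the total gain with that of the fixed price 0.5, which
    maximizes the expected gain p * (1 - p) in this uniform example.
    Returns (total_gain, benchmark_gain).
    """
    rng = random.Random(seed)
    price = initial_price      # arbitrary first price (assumption)
    seen_sum, seen_count = 0.0, 0
    total_gain, benchmark_gain = 0.0, 0.0
    for _ in range(horizon):
        v, w = rng.random(), rng.random()
        total_gain += gain_from_trade(price, v, w)
        benchmark_gain += gain_from_trade(0.5, v, w)
        # Full feedback: update the empirical mean with both valuations.
        observed = full_feedback(price, v, w)
        seen_sum += sum(observed)
        seen_count += len(observed)
        price = seen_sum / seen_count
    return total_gain, benchmark_gain


if __name__ == "__main__":
    total, benchmark = simulate_running_mean(10_000)
    print("realized regret vs. fixed price 0.5:", benchmark - total)
```

The difference `benchmark - total` is a realized (single-run) regret against one fixed price, so it fluctuates from seed to seed and can even be slightly negative; averaging over many seeds would be needed to approximate the expected regret that the bounds above refer to.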