Abstract: Motivated by Carbon Emissions Trading Schemes, Treasury Auctions, and Procurement Auctions, which all involve the auctioning of multiple homogeneous units, we consider the problem of learning how to bid in repeated multi-unit pay-as-bid auctions. In each of these auctions, a large number of (identical) items are allocated to the largest submitted bids, and the price of each winning bid is equal to the bid itself. The problem of learning how to bid in pay-as-bid auctions is challenging due to the combinatorial nature of the action space. We overcome this challenge by focusing on the offline setting, where the bidder optimizes their vector of bids while only having access to the past bids submitted by other bidders. We show that the optimal solution to the offline problem can be obtained using a polynomial-time dynamic programming (DP) scheme. We leverage the structure of the DP scheme to design online learning algorithms with polynomial time and space complexity under full-information and bandit feedback settings. We achieve regret upper bounds of $O(M\sqrt{T\log |\mathcal{B}|})$ and $O(M\sqrt{|\mathcal{B}|T\log |\mathcal{B}|})$, respectively, where $M$ is the number of units demanded by the bidder, $T$ is the total number of auctions, and $|\mathcal{B}|$ is the size of the discretized bid space. We accompany these results with a regret lower bound that matches the linear dependency on $M$. Our numerical results suggest that when all agents behave according to our proposed no-regret learning algorithms, the resulting market dynamics mainly converge to a welfare-maximizing equilibrium where bidders submit uniform bids. Lastly, our experiments demonstrate that the pay-as-bid auction consistently generates significantly higher revenue than its popular alternative, the uniform-price auction.
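To make the allocation and payment rule concrete, the following is a minimal Python sketch of a single pay-as-bid auction from one bidder's point of view. It is an illustration only, not the DP scheme or the learning algorithms of the paper. The function name `pay_as_bid_outcome`, the per-unit `valuations` list, and the tie-breaking rule (ties resolved against our bidder) are assumptions introduced here; the abstract does not specify them.

```python
def pay_as_bid_outcome(own_bids, competing_bids, num_units, valuations):
    """Return (units_won, utility) for one bidder in a single pay-as-bid auction.

    own_bids       -- our bidder's vector of bids, one per unit demanded
    competing_bids -- all bids submitted by the other bidders
    num_units      -- number of identical units being auctioned
    valuations     -- hypothetical value of our 1st, 2nd, ... unit won
    """
    # Tag each bid with its owner: 0 = our bidder, 1 = a competitor.
    tagged = [(b, 0) for b in own_bids] + [(b, 1) for b in competing_bids]

    # Highest bids first. Assumption: ties are broken against our bidder,
    # so a competitor's bid is ranked ahead of an equal bid of ours.
    tagged.sort(key=lambda t: (-t[0], -t[1]))

    # The num_units largest bids win.
    winners = tagged[:num_units]
    won_bids = sorted((b for b, owner in winners if owner == 0), reverse=True)

    # Pay-as-bid: each winning bid pays exactly its own amount.
    utility = sum(valuations[k] - bid for k, bid in enumerate(won_bids))
    return len(won_bids), utility


if __name__ == "__main__":
    # Three units for sale; we bid 5 and 3, competitors bid 6, 4, and 2.
    # The winning bids are 6, 5, and 4, so we win one unit and pay 5 for it.
    print(pay_as_bid_outcome([5, 3], [6, 4, 2], 3, [7, 6]))
```

In the offline setting described above, a bidder would evaluate a candidate bid vector by averaging this utility over the competing bids observed in past auctions; the paper finds the maximizing vector with a DP scheme rather than by enumerating bid vectors.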

Learning to Bid Long-Term: Multi-Agent Reinforcement Learning with Long-Term and Sparse Reward in Repeated Auction Games

Deep Reinforcement Learning for Strategic Bidding in Electricity Markets

Multi-Agent Reinforcement Learning for Long-Term Network Resource Allocation through Auction: a V2X Application

Using Multi-Agent Reinforcement Learning in Auction Simulations

Learning Best Response Policies in Dynamic Auctions via Deep Reinforcement Learning

Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)

Infer Your Enemies and Know Yourself, Learning in Real-Time Bidding with Partially Observable Opponents

Deep Reinforcement Learning for Sequential Combinatorial Auctions

Learning in Repeated Multi-Unit Pay-As-Bid Auctions

Applying Opponent Modeling for Automatic Bidding in Online Repeated Auctions

Multi-Agent Learning in Double-side Auctions for Peer-to-peer Energy Trading

Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising

Strategic bidding in freight transport using deep reinforcement learning

Understanding Iterative Combinatorial Auction Designs via Multi-Agent Reinforcement Learning

Auction Design through Multi-Agent Learning in Peer-to-Peer Energy Trading

Using Reinforcement Learning to Validate Empirical Game-Theoretic Analysis: A Continuous Double Auction Study

Approximating Auction Equilibria with Reinforcement Learning

Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations

Learning to Incentivize Other Learning Agents

Individual Reward Assisted Multi-Agent Reinforcement Learning

A Deep Reinforcement Learning Bidding Algorithm on Electricity Market