Abstract: We study a simple model of algorithmic collusion in which Q-learning algorithms are designed in a strategic fashion. We let players (*designers*) choose their exploration policy simultaneously prior to letting their algorithms repeatedly play a prisoner's dilemma. We prove that, in equilibrium, collusive behavior is reached with positive probability. Our numerical simulations indicate symmetry of the equilibria and give insight into how they are affected by a parameter of interest. We also investigate general profiles of exploration policies. We characterize the behavior of the system for extreme profiles (fully greedy and fully explorative) and use numerical simulations and clustering methods to measure the likelihood of collusive behavior in general cases.

What problem does this paper attempt to address?

The paper asks whether, and how, algorithmic collusion can arise when the learning algorithms themselves are designed competitively. Specifically, the authors investigate how two players (designers), each choosing the exploration policy of a Q-learning algorithm, can end up with algorithms that behave collusively in a repeated prisoner's dilemma. The main objectives of the paper are:

1. **Analyzing algorithm behavior under extreme exploration policies**: The authors first analyze the extreme exploration parameters theoretically (ε=0, fully greedy, and ε=1, fully explorative) and characterize how the system behaves in each case.
2. **Demonstrating collusive behavior in Nash equilibrium**: The authors prove that, in the Nash equilibrium of the design game, collusive behavior is reached with positive probability. Even when the designers compete, their algorithms can therefore end up cooperating to some degree.
3. **Numerical simulation and clustering analysis**: Through extensive numerical simulations, the authors study the behavior of the algorithms under general exploration policies and use clustering methods (such as K-means) to estimate the likelihood of spontaneous coupling under different parameter combinations.
4. **Exploring the mechanism of spontaneous coupling**: The authors analyze in detail how spontaneous coupling arises in Q-learning algorithms with asynchronous updates, and explain how this mechanism produces alternating phases of cooperation and defection.

In summary, the paper aims to show, through theoretical analysis and numerical simulations, that algorithmic collusion is possible in multi-agent reinforcement learning environments and to explain the mechanisms behind it. This helps in understanding how algorithms behave in economic competition and may inform efforts to prevent potential unfair competition.
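The setting described above can be illustrated with a minimal sketch: two ε-greedy Q-learners repeatedly playing a prisoner's dilemma, with each designer's choice reduced to a single exploration rate. This is not the authors' implementation. The payoff values, the stateless form of Q-learning, the learning rate `alpha`, the discount factor `gamma`, the zero initialization, and the names `PAYOFF`, `choose`, and `simulate` are all assumptions made for illustration.

```python
import random

# Hypothetical payoffs (an assumption, not taken from the paper).
# PAYOFF[(own_action, other_action)] is the reward to the acting player.
# Actions: 0 = cooperate, 1 = defect.
PAYOFF = {(0, 0): 2.0, (0, 1): 0.0, (1, 0): 3.0, (1, 1): 1.0}


def choose(q, eps, rng):
    """Epsilon-greedy choice: explore uniformly with probability eps,
    otherwise play the action with the larger Q-value (ties go to action 0)."""
    if rng.random() < eps:
        return rng.randrange(2)
    return 0 if q[0] >= q[1] else 1


def simulate(eps1, eps2, steps=10_000, alpha=0.1, gamma=0.9, seed=0):
    """Run two stateless Q-learners against each other.

    Returns the fraction of rounds with mutual cooperation and the final
    Q-values. All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[player][action]
    eps = (eps1, eps2)
    mutual_coop = 0
    for _ in range(steps):
        actions = [choose(q[i], eps[i], rng) for i in range(2)]
        for i in range(2):
            a = actions[i]
            reward = PAYOFF[(a, actions[1 - i])]
            # Asynchronous update: only the Q-value of the action that was
            # actually played is revised in this round.
            q[i][a] += alpha * (reward + gamma * max(q[i]) - q[i][a])
        if actions == [0, 0]:
            mutual_coop += 1
    return mutual_coop / steps, q


if __name__ == "__main__":
    for profile in [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0)]:
        share, _ = simulate(*profile)
        print(profile, round(share, 3))
```

With fully explorative learners (ε=1 for both), actions are uniformly random, so mutual cooperation occurs in roughly a quarter of the rounds. With fully greedy learners (ε=0), the outcome in this sketch is driven entirely by the initial Q-values and the tie-breaking rule, so it should not be read as a result about the paper's model.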

Algorithmic collusion under competitive design

On Mechanism Underlying Algorithmic Collusion

Robust Algorithmic Collusion

Naive Algorithmic Collusion: When Do Bandit Learners Cooperate and When Do They Compete?

Algorithmic Collusion: Genuine or Spurious?

Algorithmic Collusion Without Threats

Algorithmic Collusion in Assortment Games

Collusive Outcomes Without Collusion

Artificial Intelligence and Spontaneous Collusion

Algorithmic Collusion: Insights from Deep Learning

Algorithmic Collusion: Supra-Competitive Prices via Independent Algorithms

On algorithmic collusion and reward-punishment schemes

Artificial Intelligence and Algorithmic Price Collusion in Two-sided Markets

Algorithmic Collusion in Cournot Duopoly Market: Evidence from Experimental Economics

Adversarial competition and collusion in algorithmic markets

Algorithms, Machine Learning, and Collusion

Algorithmic Collusion in Dynamic Pricing with Deep Reinforcement Learning

Algorithmic Collusion and Price Discrimination: The Over-Usage of Data

Competition among Parallel Contests

Combating Algorithmic Collusion: A Mechanism Design Approach

Robustly Leveraging Collusion in Combinatorial Auctions