Opponent Cart-Pole Dynamics for Reinforcement Learning of Competing Agents

Huang Xun
DOI: https://doi.org/10.1007/s10409-022-09005-x
IF: 3.5
2022-01-01
Acta Mechanica Sinica
Abstract: In this work, the classical single cart-pole dynamic system is extended to a double cart-pole dynamic system that includes a competing target, which enables the study of multi-agent deep learning problems at an affordable cost. The key issues, such as the system dynamics, the reward function, and the simultaneous training of opponent agents, are discussed in detail. To showcase the system dynamics, a couple of agents are trained, and analysis of the competition results reveals the key pattern for winning. It appears that a defensive agent is always defeated by an offensive agent, even though the associated neural network has very limited intelligence. When both agents are defensive, the system dynamics remain stable and reach a Nash equilibrium. Overall, the proposed dynamic system could serve as a surrogate model and assist the study of how to escape the so-called Thucydides trap.
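For orientation, the classical single cart-pole system that the paper takes as its starting point can be sketched as below. This is a minimal illustration of the textbook frictionless cart-pole equations with explicit Euler integration, not the paper's double cart-pole model: the coupling between the two carts, the competing target, and the reward function are not given in the abstract and are not reproduced here. The function name `cartpole_step` and all parameter values are assumptions chosen for illustration.

```python
import math

# Illustrative parameters (assumed values, not taken from the paper).
GRAVITY = 9.8        # m/s^2
MASS_CART = 1.0      # kg
MASS_POLE = 0.1      # kg
HALF_LENGTH = 0.5    # m, distance from pivot to pole centre of mass
DT = 0.02            # s, integration time step


def cartpole_step(state, force):
    """Advance a single frictionless cart-pole by one explicit Euler step.

    state is (x, x_dot, theta, theta_dot), with theta measured from the
    upright position; force is the horizontal force applied to the cart.
    """
    x, x_dot, theta, theta_dot = state
    total_mass = MASS_CART + MASS_POLE
    pole_mass_length = MASS_POLE * HALF_LENGTH
    sin_t, cos_t = math.sin(theta), math.cos(theta)

    # Intermediate term shared by both accelerations.
    temp = (force + pole_mass_length * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_LENGTH * (4.0 / 3.0 - MASS_POLE * cos_t ** 2 / total_mass)
    )
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass

    # Explicit Euler update: positions use the old velocities.
    return (
        x + DT * x_dot,
        x_dot + DT * x_acc,
        theta + DT * theta_dot,
        theta_dot + DT * theta_acc,
    )
```

Starting from the upright rest state, a push to the right accelerates the cart to the right while the pole begins to rotate the opposite way, which is the instability an agent has to counteract.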