Adaptive Optimal Control via Q-Learning for Multi-Agent Pursuit-Evasion Games

Xu Dong, Huaguang Zhang, Zhongyang Ming
DOI: https://doi.org/10.1109/tcsii.2024.3354120
2024-01-01
Abstract: This work solves multi-agent pursuit-evasion (PE) games to find the optimal strategy for every participant. In these games, multiple pursuers aim to capture multiple evaders that are trying to escape. To determine distributed control policies for each agent, a graph-theoretic technique is used to model how agents with limited sensing capabilities interact with one another. Achieving the optimal actions requires solving the Hamilton-Jacobi-Bellman (HJB) equations online and in real time, which is addressed by introducing policy iteration based on Q-learning. The game ends once capture occurs. Simulation results demonstrate the viability of the proposed techniques.
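To illustrate the idea of policy iteration on a Q-function mentioned in the abstract, the following is a minimal sketch on a much simpler problem than the paper's: a single-agent, scalar, discrete-time linear-quadratic regulator. This stand-in is an assumption made for illustration only; the paper itself treats multi-agent PE games and HJB equations, and its actual algorithm is not reproduced here. The function names, the system parameters `a`, `b`, the cost weights `q`, `r`, and the sampling scheme are all hypothetical.

```python
import numpy as np


def q_policy_iteration(a, b, q, r, k0=0.0, iters=20, samples=50, seed=0):
    """Q-function policy iteration for x' = a*x + b*u with cost q*x^2 + r*u^2.

    Illustrative stand-in only (not the paper's algorithm). The Q-function is
    parameterized as Q(x, u) = h[0]*x^2 + 2*h[1]*x*u + h[2]*u^2, and the
    policy is u = -k*x. The initial gain k0 must be stabilizing.
    """
    rng = np.random.default_rng(seed)
    k = k0
    h = np.zeros(3)
    for _ in range(iters):
        # Collect transitions with exploratory (random) states and inputs.
        x = rng.normal(size=samples)
        u = rng.normal(size=samples)
        x_next = a * x + b * u
        u_next = -k * x_next  # the current policy is applied at the next step

        # Policy evaluation: solve Q(x, u) - Q(x', u') = stage cost
        # for the kernel h by least squares.
        phi = np.column_stack([x**2, 2 * x * u, u**2])
        phi_next = np.column_stack([x_next**2, 2 * x_next * u_next, u_next**2])
        cost = q * x**2 + r * u**2
        h, *_ = np.linalg.lstsq(phi - phi_next, cost, rcond=None)

        # Policy improvement: minimizing Q over u gives u = -(h[1]/h[2]) * x.
        k = h[1] / h[2]
    return k, h


def riccati_gain(a, b, q, r, iters=1000):
    """Reference gain from fixed-point iteration of the scalar Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a**2 * p - (a * b * p) ** 2 / (r + b**2 * p)
    return a * b * p / (r + b**2 * p)


if __name__ == "__main__":
    k_learned, _ = q_policy_iteration(a=0.9, b=1.0, q=1.0, r=1.0)
    k_model = riccati_gain(a=0.9, b=1.0, q=1.0, r=1.0)
    print(k_learned, k_model)
```

The evaluation step uses only sampled transitions and stage costs, not the model parameters directly, which is the property that makes Q-function methods attractive for online use. In this sketch the learned gain can be compared against the model-based Riccati gain as a sanity check.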
engineering, electrical & electronic