Self-play Decision-making Method of Deep Reinforcement Learning Guided by Behavior Tree under Complex Environment

Shuai Wang, Bo Wang, Xiaochen Xiong
DOI: https://doi.org/10.23919/CCC63176.2024.10662399
2024-07-28
Abstract: With advances in artificial intelligence, military simulation has evolved from human-versus-human exercises to autonomous self-improvement through reinforcement-learning self-play. Unlike air and sea engagements, land battles involve terrain whose complexity cannot be ignored. Starting from a 2D map of the real world, we introduce elevation data through a grid map to construct the basic scenes and apply fuzzy theory to capture the nuances of different terrain types. Based on the latest unmanned combat vehicle data, we design a land combat agent model and adopt the deep deterministic policy gradient (DDPG) and proximal policy optimization (PPO) algorithms. Through self-play, the vehicle's policy is optimized so that it can adapt effectively to various battlefield scenarios. This method not only enhances the realism of the simulation by incorporating key terrain features, but also significantly improves the strategic capability of the autonomous agents. By repeatedly playing against themselves to refine their tactics, these agents are better prepared for the unpredictable dynamics of land warfare. Our results show that training with DDPG and PPO guided by behavior trees significantly improves the agents' ability to navigate and engage in complex terrain, demonstrating the potential of AI-driven simulation to shape future military strategy and training.
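The terrain step mentioned in the abstract (elevation data on a grid map, graded with fuzzy theory) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 4-neighbour slope measure, the three fuzzy sets, and every membership breakpoint are assumptions chosen only for illustration.

```python
# Hypothetical sketch: fuzzy terrain grading on an elevation grid.
# All names and breakpoints are illustrative assumptions, not values from the paper.

def triangular(x, a, b, c):
    """Triangular membership function: 0 outside (a, c), peak of 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def ramp(x, lo, hi):
    """Shoulder membership: 0 below lo, 1 above hi, linear in between."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def slope(grid, r, c, cell_size=1.0):
    """Largest absolute elevation change to a 4-neighbour, per unit distance."""
    rows, cols = len(grid), len(grid[0])
    best = 0.0
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            best = max(best, abs(grid[nr][nc] - grid[r][c]) / cell_size)
    return best


def terrain_memberships(s):
    """Degree to which slope s belongs to each assumed fuzzy terrain class."""
    return {
        "flat": triangular(s, -0.5, 0.0, 0.5),
        "hilly": triangular(s, 0.2, 0.6, 1.0),
        "steep": ramp(s, 0.8, 1.5),
    }


if __name__ == "__main__":
    elevation = [[0.0, 0.3], [0.0, 0.0]]  # toy 2x2 elevation grid
    print(terrain_memberships(slope(elevation, 0, 0)))
```

A cell can belong partly to several classes at once, which is the point of the fuzzy grading: a reinforcement-learning agent could receive these degrees as part of its observation instead of a single hard terrain label.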
Computer Science, Engineering