Parallel Intelligent Command Decision-Making Technology Based on Combat Prior Knowledge and Reinforcement Learning Algorithm

Bojian Tang, Yuxiang Sun, Jiahui Yu, Tao Jin, Xianzhong Zhou
DOI: https://doi.org/10.1109/iccsi53130.2021.9736221
2021-01-01
Abstract: Intelligent command decision-making based on reinforcement learning has gradually become a new trend in the field of intelligent wargaming. However, in a complex wargame environment, an agent that relies solely on reinforcement learning struggles to gain effective experience quickly in the opening stage, so convergence is very slow. We present the WKP-BCQ (War-Knowledge-Prior Batch-constrained deep Q-learning) algorithm, which uses prior knowledge to address the long periods of ineffective exploration by agents. Our model generates a knowledge-based cache in the same reward environment from expert rules and actual human competition data. Training a discrete BCQ algorithm directly on this cached data yields an effective strategy and solves the cold-start problem of relying solely on reinforcement learning. The proposed WKP-BCQ algorithm is verified on a tactical-level wargame deduction and confrontation system, and the experiments show that it efficiently generates intelligent decision suggestions and exhibits stronger intelligence.
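The two-stage idea in the abstract (fill a cache from prior knowledge, then learn offline from it under a batch constraint) can be illustrated with a minimal sketch. This is not the authors' implementation: the five-state chain environment, the `expert_rule` function, and all constants are hypothetical; a Q-table stands in for the deep Q-network; and simple action counts stand in for the learned behavior model that discrete BCQ normally uses to filter actions.

```python
import random

N_STATES, N_ACTIONS, GOAL = 5, 2, 4   # toy chain: action 0 = left, 1 = right
GAMMA, TAU, LR = 0.9, 0.3, 0.5        # discount, BCQ threshold, step size


def step(state, action):
    """Hypothetical toy dynamics: reward 1 only when the goal is reached."""
    nxt = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    return nxt, float(nxt == GOAL), nxt == GOAL


def expert_rule(state, rng, noise=0.1):
    """Stand-in for an expert rule: head right, with occasional mistakes."""
    return 1 if rng.random() > noise else 0


def build_cache(n_episodes, rng, max_steps=20):
    """Fill the knowledge-based cache with (s, a, r, s', done) tuples."""
    cache = []
    for _ in range(n_episodes):
        s = 0
        for _ in range(max_steps):
            a = expert_rule(s, rng)
            s2, r, done = step(s, a)
            cache.append((s, a, r, s2, done))
            s = s2
            if done:
                break
    return cache


def allowed_actions(counts, state):
    """Batch constraint: keep actions whose relative frequency exceeds TAU."""
    top = max(counts[state])
    if top == 0:                       # state never seen: no constraint
        return list(range(N_ACTIONS))
    return [a for a in range(N_ACTIONS) if counts[state][a] / top > TAU]


def train_bcq(cache, n_sweeps=200):
    """Offline Q-learning over the cache with batch-constrained targets."""
    counts = [[0] * N_ACTIONS for _ in range(N_STATES)]
    for s, a, *_ in cache:
        counts[s][a] += 1
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(n_sweeps):
        for s, a, r, s2, done in cache:
            # Bootstrap only from actions the cached data supports.
            best_next = 0.0 if done else max(
                q[s2][b] for b in allowed_actions(counts, s2))
            q[s][a] += LR * (r + GAMMA * best_next - q[s][a])
    return q, counts


def greedy_policy(q, counts):
    """Pick the highest-valued action among the allowed ones in each state."""
    return [max(allowed_actions(counts, s), key=lambda a: q[s][a])
            for s in range(N_STATES)]


rng = random.Random(0)
cache = build_cache(50, rng)           # stage 1: prior knowledge -> cache
q, counts = train_bcq(cache)           # stage 2: train from the cache only
policy = greedy_policy(q, counts)
print(policy[:GOAL])                   # actions chosen in non-terminal states
```

The agent never interacts with the environment during training, which is the sense in which a cache seeded by expert rules can sidestep the cold start. The threshold `TAU` controls how far the learned policy may stray from the cached behavior: rarely taken actions are excluded both from the bootstrap target and from the final policy.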