An Improved PPO for Multiple Unmanned Aerial Vehicles

Xue Bai, Chengxuan Lu, Qihao Bao, Shansheng Zhu, Shaojie Xia
DOI: https://doi.org/10.1088/1742-6596/1757/1/012156
2021-01-01
Journal of Physics: Conference Series
Abstract: In recent years, multi-agent reinforcement learning (MARL) has been widely applied, especially in large multi-role games such as StarCraft and in unmanned aerial vehicle (UAV) combat simulations. However, MARL still faces challenges in fast convergence and efficient cooperation. In a multi-agent scenario, a fully centralized network model is hard to train to convergence because the joint action space is huge, while a fully decentralized model struggles to make agents cooperate and reach a global optimum. To jointly control multiple agents, we propose an improved PPO algorithm that combines a centralized network with decentralized networks. Our method not only reduces the action space and accelerates convergence, but also introduces more diversity into agents’ decision-making.
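The action-space argument in the abstract can be made concrete with a small counting sketch. It assumes discrete actions and the same number of actions for every agent; the figure of 5 actions per agent and the agent counts are illustrative assumptions, not values from the paper, and the function names are hypothetical. It shows only why a single centralized policy head grows exponentially with the number of agents while per-agent heads grow linearly, not the authors' network design.

```python
def joint_action_count(num_agents: int, actions_per_agent: int) -> int:
    """Outputs a single centralized head needs: one per joint action."""
    return actions_per_agent ** num_agents


def factored_action_count(num_agents: int, actions_per_agent: int) -> int:
    """Total outputs when each agent has its own head over its own actions."""
    return num_agents * actions_per_agent


if __name__ == "__main__":
    ACTIONS_PER_AGENT = 5  # assumed value, for illustration only
    for n in (2, 4, 8):
        joint = joint_action_count(n, ACTIONS_PER_AGENT)
        factored = factored_action_count(n, ACTIONS_PER_AGENT)
        print(f"agents={n}: centralized={joint}, per-agent heads={factored}")
```

Under these assumptions, the gap between the two counts widens quickly as agents are added, which is consistent with the convergence difficulty the abstract attributes to fully centralized models.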