Energy Constrained Multi-Agent Reinforcement Learning for Coverage Path Planning

Chenyang Zhao, Juan Liu, Suk-Un Yoon, Xinde Li, Heqing Li, Zhentong Zhang
DOI: https://doi.org/10.1109/IROS55552.2023.10341412
2023-01-01
Abstract: Existing research treats the multi-agent area coverage path planning problem as a combination of the Traveling Salesman Problem (TSP) and Coverage Path Planning (CPP). However, these approaches suffer from poor observation ability in the online phase and high computational cost in the offline phase, which makes them difficult to apply to energy-constrained Unmanned Aerial Vehicles (UAVs) and unable to adjust their strategy dynamically. In this paper, we decompose the task into two sub-problems: multi-agent path planning and sub-region CPP. We model the multi-agent path planning problem as a Collective Markov Decision Process (C-MDP) and design an Energy Constrained Multi-Agent Reinforcement Learning (ECMARL) algorithm based on the centralized training and distributed execution concept. To account for the energy constraints of UAVs, a UAV propulsion power model is established to measure energy consumption, and a load balancing strategy is applied to dynamically allocate target areas to each UAV. If a UAV becomes energy-depleted, ECMARL can adjust the mission strategy in real time according to environmental information and the energy storage conditions of the other UAVs. When UAVs reach each sub-region of interest, Back-and-Forth Paths (BFPs) are adopted to solve the CPP problem, which provides guarantees on full coverage, optimality, and complexity for the sub-problem. Comprehensive theoretical analysis and experiments demonstrate that ECMARL is superior to the traditional offline TSP-CPP strategy in terms of solution quality and computational time, and that it can effectively handle energy-constrained UAVs.
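The back-and-forth idea used for sub-region coverage can be illustrated with a minimal sketch. It assumes the sub-region is a rectangle discretized into grid cells, which the abstract does not state; the function name `back_and_forth_path` and the cell representation are hypothetical and are not taken from the paper, whose actual BFP construction may differ.

```python
from typing import List, Tuple


def back_and_forth_path(rows: int, cols: int) -> List[Tuple[int, int]]:
    """Visit every cell of a rows x cols grid in a back-and-forth sweep.

    Assumption: the sub-region is a rectangular grid; cells are (row, col).
    """
    path: List[Tuple[int, int]] = []
    for r in range(rows):
        # Even rows sweep left to right, odd rows sweep right to left,
        # so the end of one row is adjacent to the start of the next.
        col_order = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        for c in col_order:
            path.append((r, c))
    return path


# Example: sweep a 3 x 4 sub-region.
path = back_and_forth_path(3, 4)
```

In this sketch every cell is visited exactly once and consecutive cells are always neighbors, which is the property that makes a back-and-forth sweep a natural fit for full coverage of a simple sub-region.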