An MA-HPPO Approach for Multi-UAV Data Collection

Zixuan Bai, Jia Shi, Zan Li, Meng Li, Xiaomin Liao
DOI: https://doi.org/10.1109/twc.2024.3458194
IF: 10.4
2024-01-01
IEEE Transactions on Wireless Communications
Abstract: This paper investigates data collection by a multi-functional unmanned aerial vehicle (UAV) swarm in a dynamic wireless sensor network (WSN) whose sensors have different mobility profiles. As a practical consideration, each UAV has only limited observation information, which risks becoming obsolete, and operates under limited battery life. The optimization problem is formulated as a partially observable Markov decision process (POMDP) that includes discrete on-off variables for collection, radar, communication and movement, and continuous variables for the transmit power, UAV flying direction and velocity. To solve the problem, we propose a multi-agent hybrid proximal policy optimization algorithm with reward shaping and pre-training (MAHPPO-RSP). The algorithm is trained through a two-step procedure of supervised learning and reinforcement learning, thereby combining human experience with autonomous learning. The results show that the proposed MAHPPO-RSP algorithm converges stably. Furthermore, it achieves a promising trade-off between data collection and energy consumption, outperforming two baseline schemes.
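The hybrid action space described in the abstract (four on-off switches plus three continuous controls) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the variable names, the independent Bernoulli and Gaussian distributions, and the clipping threshold `eps=0.2` are all assumptions made for illustration, and the abstract does not specify the policy parameterization. The sketch only shows how one joint log-probability over discrete and continuous parts can feed the standard PPO clipped objective.

```python
import math
import random
from dataclasses import dataclass

# Hypothetical names: the abstract lists these variables but not their encoding.
SWITCHES = ("collection", "radar", "communication", "movement")
CONTINUOUS = ("transmit_power", "direction", "velocity")


@dataclass
class HybridAction:
    switches: dict    # name -> 0 or 1
    continuous: dict  # name -> float (unbounded Gaussian sample)
    log_prob: float   # joint log-probability under the policy


def sample_hybrid_action(switch_probs, means, stds, rng):
    """Sample one hybrid action and its joint log-probability.

    Assumption: all components are independent, with Bernoulli on-off
    switches (probabilities strictly between 0 and 1) and Gaussian
    continuous controls.
    """
    log_prob = 0.0
    switches = {}
    for name in SWITCHES:
        p = switch_probs[name]
        on = 1 if rng.random() < p else 0
        switches[name] = on
        log_prob += math.log(p if on else 1.0 - p)

    continuous = {}
    for name in CONTINUOUS:
        mu, sigma = means[name], stds[name]
        x = rng.gauss(mu, sigma)
        continuous[name] = x
        # Log-density of a univariate Gaussian evaluated at x.
        log_prob += (-0.5 * ((x - mu) / sigma) ** 2
                     - math.log(sigma) - 0.5 * math.log(2.0 * math.pi))
    return HybridAction(switches, continuous, log_prob)


def clipped_surrogate(new_log_prob, old_log_prob, advantage, eps=0.2):
    """Standard PPO clipped objective for a single sample."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    rng = random.Random(0)
    action = sample_hybrid_action(
        {name: 0.5 for name in SWITCHES},
        {name: 0.0 for name in CONTINUOUS},
        {name: 1.0 for name in CONTINUOUS},
        rng,
    )
    print(action.switches, action.continuous, action.log_prob)
```

In a multi-agent setting, each UAV would hold its own policy outputs (`switch_probs`, `means`, `stds`) computed from its partial observation; how those outputs are produced, and how reward shaping and pre-training enter, is left open here because the abstract does not describe it.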