Population-Invariant MADRL for AoI-Aware UAV Trajectory Design and Communication Scheduling in Wireless Sensor Networks

Xuanhan Zhou, Jun Xiong, Haitao Zhao, Chao Yan, Haijun Wang, Jibo Wei
DOI: https://doi.org/10.1109/jiot.2024.3474926
IF: 10.6
2024-01-01
IEEE Internet of Things Journal
Abstract: Unmanned aerial vehicles (UAVs) are recognized as effective data collectors for wireless sensor networks. The age of information (AoI), a metric indicating data freshness, is crucial for decision-making in time-sensitive applications. It can be significantly reduced by jointly optimizing UAV trajectories and communication scheduling of sensor nodes (SNs). However, rapid changes in the environment make it challenging to pre-design UAV trajectories and communication scheduling decisions using traditional methods, especially when central controllers are absent and the numbers of UAVs and SNs vary. In this paper, we propose hypernetwork-based QMIX (HyperQMIX), a population-invariant multi-agent deep reinforcement learning (MADRL) algorithm capable of transferring policies across tasks with varying population sizes. Firstly, we design neural network modules adaptable to varying input and output dimensions, facilitated by parameter generation through a hypernetwork. Then, HyperQMIX leverages these modules to process fluctuations in state and action dimensions. This approach ensures that the network structure remains consistent regardless of population sizes, thereby enhancing the algorithm’s scalability. Extensive simulations demonstrate that HyperQMIX significantly outperforms state-of-the-art algorithms in terms of learning efficiency and converged performance. Moreover, agents pre-trained with HyperQMIX perform well in tasks of different population sizes without additional training. Fine-tuning these models achieves performance comparable to training from scratch.
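Illustrative sketch: The abstract does not give the HyperQMIX architecture in detail, so the following is only a minimal sketch of the general idea it names: a hypernetwork generates layer parameters per entity, so the same parameters handle any number of UAVs or SNs. All names, dimensions, and the sum pooling (`FEAT_DIM`, `HIDDEN_DIM`, `HYPER_W`, `OUT_W`, `embed_entities`, `score_actions`, `random_entities`) are assumptions made for illustration, not the authors' implementation.

```python
import random

FEAT_DIM = 4    # per-entity feature size (hypothetical)
HIDDEN_DIM = 8  # fixed embedding size (hypothetical)

_rng = random.Random(0)

# Hypernetwork parameters (assumed to be plain linear maps for brevity).
# HYPER_W maps one entity's features to the FEAT_DIM * HIDDEN_DIM weights
# of that entity's own input layer; OUT_W maps a candidate's features to
# the HIDDEN_DIM weights of that candidate's own output head.
HYPER_W = [[_rng.gauss(0.0, 0.1) for _ in range(FEAT_DIM * HIDDEN_DIM)]
           for _ in range(FEAT_DIM)]
OUT_W = [[_rng.gauss(0.0, 0.1) for _ in range(HIDDEN_DIM)]
         for _ in range(FEAT_DIM)]


def matvec(vec, mat):
    """Row vector times matrix, i.e. vec @ mat."""
    return [sum(v * row[j] for v, row in zip(vec, mat))
            for j in range(len(mat[0]))]


def embed_entities(entities):
    """Map a variable-length list of entities to a fixed-size vector."""
    pooled = [0.0] * HIDDEN_DIM
    for feats in entities:
        flat = matvec(feats, HYPER_W)  # generated weights for this entity
        layer = [flat[i * HIDDEN_DIM:(i + 1) * HIDDEN_DIM]
                 for i in range(FEAT_DIM)]
        out = matvec(feats, layer)     # apply the generated layer
        pooled = [p + o for p, o in zip(pooled, out)]  # sum pooling
    return pooled


def score_actions(hidden, candidates):
    """Return one score per candidate, however many candidates there are."""
    scores = []
    for feats in candidates:
        head = matvec(feats, OUT_W)    # generated output weights
        scores.append(sum(h * w for h, w in zip(hidden, head)))
    return scores


def random_entities(n):
    """Hypothetical helper: n random feature vectors for demonstration."""
    return [[_rng.gauss(0.0, 1.0) for _ in range(FEAT_DIM)] for _ in range(n)]


hidden_small = embed_entities(random_entities(3))
hidden_large = embed_entities(random_entities(7))
scores = score_actions(hidden_small, random_entities(5))
```

In this sketch, `hidden_small` and `hidden_large` have the same length even though they summarize 3 and 7 entities, and `scores` has one entry per candidate. The parameter count depends only on `FEAT_DIM` and `HIDDEN_DIM`, which is the property that would let a trained model be reused when the population size changes.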