Decomposition-based Multi-Agent Distributional Reinforcement Learning for Task-Oriented UAV Collaboration with Noisy Rewards

Wei Geng, Baidi Xiao, Rongpeng Li, Ning Wei, Zhifeng Zhao, Honggang Zhang
DOI: https://doi.org/10.1109/wcsp58612.2023.10404257
2023-01-01
Abstract: Collaborating unmanned aerial vehicles (UAVs) are often deployed to perform complex tasks using multi-agent reinforcement learning (MARL). However, environmental disturbances, which commonly lead to noisy observations (e.g., rewards and states), can significantly affect the performance of task-oriented UAV collaboration. It is therefore imperative to redesign MARL so that it mitigates the adverse impact of noisy rewards. In this paper, we propose a novel decomposition-based multi-agent distributional RL method that approximates the globally shared noisy reward with a Gaussian mixture model and decomposes it into a combination of individual distributional local rewards, with which each agent can be updated locally through distributional RL. In addition, the optimality of the distributional decomposition is theoretically validated, and the loss functions are carefully designed to tackle the decomposition ambiguity. We also verify the effectiveness of the proposed method through extensive simulation experiments with noisy rewards.
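To make the reward model in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a one-dimensional Gaussian mixture for the noisy global reward and, separately, independent Gaussian local rewards whose sum has additive means and variances. All function names and numbers (`gmm_moments`, `sample_gmm`, `sum_of_local_gaussians`, the weights, means, and standard deviations) are hypothetical and chosen only for illustration.

```python
import random


def gmm_moments(weights, means, stds):
    """Return the mean and variance of a 1-D Gaussian mixture."""
    mean = sum(w * m for w, m in zip(weights, means))
    # E[X^2] of a mixture is the weighted sum of each component's E[X^2].
    second = sum(w * (s * s + m * m) for w, m, s in zip(weights, means, stds))
    return mean, second - mean * mean


def sample_gmm(weights, means, stds, rng):
    """Draw one noisy reward: pick a component, then sample from it."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], stds[k])


def sum_of_local_gaussians(local_means, local_stds):
    """Mean and variance of a sum of independent Gaussian local rewards."""
    return sum(local_means), sum(s * s for s in local_stds)


# Hypothetical two-component mixture for the globally shared noisy reward.
weights = [0.7, 0.3]
means = [1.0, -2.0]
stds = [0.5, 1.0]
global_mean, global_var = gmm_moments(weights, means, stds)

# Hypothetical local rewards for three agents (assumed independent).
local_mean, local_var = sum_of_local_gaussians([0.5, 0.3, 0.2], [0.1, 0.2, 0.2])

# Empirical check of the mixture mean using a seeded generator.
rng = random.Random(0)
samples = [sample_gmm(weights, means, stds, rng) for _ in range(20000)]
empirical_mean = sum(samples) / len(samples)
```

The sketch only shows how a mixture-shaped reward and a sum of local reward distributions can be described by their moments. How the paper actually ties the two together, including its loss functions and its treatment of decomposition ambiguity, is not specified in the abstract and is not reproduced here.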