Reinforcement Learning with Value Function Decomposition for Hierarchical Multi-Agent Consensus Control

Xiaoxia Zhu
DOI: https://doi.org/10.3390/math12193062
IF: 2.4
2024-09-30
Mathematics
Abstract: A hierarchical consensus control algorithm based on value function decomposition is proposed for hierarchical multi-agent systems. To implement the algorithm, the reward function of the multi-agent system is decomposed, and two value functions are obtained by analyzing the communication content and the corresponding control objective of each layer of the hierarchy. Accordingly, each agent uses a dual-critic, single-actor network structure to realize the objective of each layer. In addition, a target network is introduced to prevent overfitting in the critic network and to improve the stability of online learning. During network parameter updates, a soft updating mechanism and an experience replay buffer are introduced to slow the update rate of the network and to improve the utilization of training data. The convergence and stability of the consensus control algorithm with the soft updating mechanism are analyzed theoretically. Finally, two experiments verify the correctness of the theoretical analysis and the effectiveness of the algorithm.
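The soft updating mechanism and the experience replay buffer named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: network parameters are stood in for by plain lists of floats, and the names soft_update, ReplayBuffer, tau, reward_1 and reward_2 are assumptions made for illustration. Storing two rewards per transition is likewise an assumed layout, chosen to mirror the decomposed reward that feeds the two critics.

```python
import random
from collections import deque


def soft_update(target, online, tau):
    """Soft (Polyak) update: target <- tau * online + (1 - tau) * target.

    A small tau makes the target parameters trail the online parameters
    slowly, which is what slows the effective update rate.
    """
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


class ReplayBuffer:
    """Fixed-capacity transition store with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # deque(maxlen=...) evicts the oldest transition once full.
        self.data = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward_1, reward_2, next_state):
        # Hypothetical layout: one reward per layer objective, so each
        # critic can be trained on its own reward component.
        self.data.append((state, action, reward_1, reward_2, next_state))

    def sample(self, batch_size):
        # Reusing stored transitions across updates is what raises the
        # utilization of training data.
        k = min(batch_size, len(self.data))
        return self.rng.sample(list(self.data), k)

    def __len__(self):
        return len(self.data)


# Toy usage: three soft updates of a target critic toward an online critic.
online_critic = [1.0, -2.0, 0.5]
target_critic = [0.0, 0.0, 0.0]
for _ in range(3):
    target_critic = soft_update(target_critic, online_critic, tau=0.1)

# Toy usage: a buffer of capacity 4 keeps only the 4 newest transitions.
buffer = ReplayBuffer(capacity=4, seed=0)
for step in range(6):
    buffer.add(state=step, action=0, reward_1=-1.0, reward_2=-0.5,
               next_state=step + 1)
batch = buffer.sample(2)
```

After three updates with tau = 0.1, each target parameter has covered a fraction 1 - 0.9^3 of the distance to its online counterpart, so the target moves toward the online critic gradually rather than copying it in a single step.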