Abstract: In a multi-agent system, agents share their local observations through a message passing system to gain global situational awareness for decision making and collaboration. When to send a message, how to encode a message, and how to leverage the received messages directly affect the effectiveness of the collaboration among agents. When training a multi-agent cooperative game using reinforcement learning (RL), the message passing system needs to be optimized together with the agent policies. This increases the model's complexity and poses significant challenges to the convergence and performance of learning. To address this issue, we propose the Belief-map Assisted Multi-agent System (BAMS), which leverages a neuro-symbolic belief map to enhance training. The belief map decodes the agent's hidden state to provide a symbolic representation of the agent's understanding of the environment and of other agents' status. The simplicity of the symbolic representation allows the ground truth information to be gathered and compared with the belief, which provides an additional channel of feedback for learning. Compared to the sporadic and delayed feedback coming from the reward in RL, the feedback from the belief map is more consistent and reliable. Agents using BAMS can learn a more effective message passing network to better understand each other, resulting in better performance. We evaluate BAMS in a cooperative predator and prey game with varying levels of map complexity and compare it to previous multi-agent message passing models. The simulation results showed that BAMS reduced training epochs by 66%, and agents that apply the BAMS model completed the game with 34.62% fewer steps on average.

What problem does this paper attempt to address?

The paper addresses how to optimize the message-passing mechanism among agents in a multi-agent system so that they make better decisions and collaborate more effectively in cooperative games. Specifically, it focuses on three questions in multi-agent reinforcement learning (MARL): when agents should send messages, how to encode messages, and how to use received messages. These factors directly affect the effectiveness of collaboration among agents. The message-passing system has to be optimized jointly with the agents' policies, which increases the complexity of the model and makes it harder for learning to converge and perform well.

To address these issues, the authors propose the Belief-map Assisted Multi-agent System (BAMS). BAMS enhances training by introducing a neuro-symbolic belief map that decodes the hidden states of agents into a symbolic representation of each agent's understanding of the environment and of other agents' states. Because this representation is simple, the belief can be compared directly with ground-truth information, which gives learning an additional feedback channel. Compared to the sporadic and delayed rewards in reinforcement learning, feedback from the belief map is more consistent and reliable. It helps agents learn more effective message-passing networks and understand each other better, and therefore perform better in the game.

The main contributions of the paper include:
1. Proposing a belief-map assisted training mechanism that supplements reinforcement learning with supervised information, accelerating training convergence.
2. Designing a belief map decoder that reconstructs a neuro-symbolic map from environmental embeddings, providing additional feedback for training. This map translates the hidden states of agents into a human-readable format, significantly improving the interpretability of the agents' decision-making process.
3. Training agents with the BAMS model so that they communicate more effectively, capture prey faster, and are less sensitive to external interference as the number of agents increases.
4. Showing in simulation that agents with these improvements can be trained effectively in large and complex environments, reducing training epochs by 66% and completing the game in 34.62% fewer steps on average.
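The idea of adding belief-map feedback to the RL objective can be illustrated with a minimal Python sketch. It is not the authors' implementation. The symbol set `SYMBOLS`, the use of per-cell cross-entropy, the function names, and the weight `belief_weight` are all assumptions made for illustration. The source only states that the decoded belief is compared with ground truth to provide extra feedback.

```python
import math

# Hypothetical symbol set for each grid cell of the belief map (assumption).
SYMBOLS = ("empty", "obstacle", "agent", "prey")


def softmax(logits):
    """Convert raw decoder scores for one cell into probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def belief_map_loss(decoded_logits, ground_truth):
    """Mean cross-entropy between the decoded belief map and the true map.

    decoded_logits: rows of cells, each cell a list of len(SYMBOLS) scores.
    ground_truth:   rows of cells, each cell a symbol name from SYMBOLS.
    """
    total, cells = 0.0, 0
    for logit_row, truth_row in zip(decoded_logits, ground_truth):
        for logits, symbol in zip(logit_row, truth_row):
            probs = softmax(logits)
            total -= math.log(probs[SYMBOLS.index(symbol)])
            cells += 1
    return total / cells


def combined_loss(rl_loss, decoded_logits, ground_truth, belief_weight=0.5):
    """Add the dense belief-map term to the RL term (the weight is assumed)."""
    return rl_loss + belief_weight * belief_map_loss(decoded_logits, ground_truth)


# A 2x2 toy map: one belief that matches the truth, one that is uninformed.
truth = [["empty", "prey"], ["agent", "obstacle"]]
confident = [[[5, 0, 0, 0], [0, 0, 0, 5]], [[0, 0, 5, 0], [0, 5, 0, 0]]]
uniform = [[[0, 0, 0, 0], [0, 0, 0, 0]], [[0, 0, 0, 0], [0, 0, 0, 0]]]

print(belief_map_loss(confident, truth) < belief_map_loss(uniform, truth))
```

In this sketch the belief-map term is available at every step, because the ground-truth map can always be compared with the decoded belief. The RL term, by contrast, depends on rewards that may arrive rarely or late.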

Multi-agent Cooperative Games Using Belief Map Assisted Training
