Abstract: In multi-agent systems, agents need to interact and collaborate with other agents in their environments. Agent modeling is crucial for facilitating agent interactions and forming adaptive cooperation strategies. However, it is challenging for agents to model the beliefs, behaviors, and intentions of other agents in non-stationary environments where all agent policies are learned simultaneously. In addition, existing methods realize agent modeling through behavior cloning, which assumes that the local information of other agents can be accessed during execution or training. This assumption is infeasible in unknown scenarios characterized by unknown agents, such as competing teams, unreliable communication, and federated learning with privacy concerns. To eliminate this assumption and achieve agent modeling in unknown scenarios, the Fact-based Agent Modeling (FAM) method is proposed, in which a fact-based belief inference (FBI) network models other agents in a partially observable environment based only on the agent's local information. The rewards and observations obtained by agents after taking actions are called facts, and FAM uses facts as the reconstruction target to learn the policy representation of other agents through a variational autoencoder. We evaluate FAM on various Multiagent Particle Environment (MPE) scenarios and compare the results with several state-of-the-art MARL algorithms. Experimental results show that, compared with baseline methods, FAM can effectively improve the efficiency of agent policy learning by making adaptive cooperation strategies in multi-agent reinforcement learning tasks, while achieving higher returns in complex competitive-cooperative mixed scenarios.

What problem does this paper attempt to address?

The paper primarily addresses the issue of agent modeling in multi-agent reinforcement learning (MARL) by proposing a Fact-based Agent Modeling (FAM) method. Specifically, the paper aims to solve the following key problems:

1. **Eliminating Traditional Assumptions**: In multi-agent systems, agents need to interact and cooperate with other agents in a non-stationary environment. Traditional agent modeling methods often assume access to other agents' local information (such as observations and actions), but this assumption does not hold in many unknown scenarios, such as competitive teams, unreliable communication, or federated learning scenarios due to privacy considerations.
2. **Agent Modeling in Non-Stationary Environments**: In a non-stationary environment, all agents' policies are being learned simultaneously, making the environment unstable for each agent and difficult to treat other agents as part of the environment directly.
3. **Agent Modeling in Partially Observable Environments**: Considering that in practical applications, agents may only obtain their local information, the paper proposes a new agent modeling method, namely Fact-based Belief Inference (FBI) network, which models other agents based solely on its local information.

To achieve the above goals, the main contributions of the paper are as follows:

1. **Proposing the FBI Network**: To eliminate the traditional assumption of accessing other agents' local information, the paper proposes an FBI network based on a variational autoencoder (VAE) to infer the policy representation of other agents based on its local information (i.e., observations, actions, and rewards).
2. **Combining FBI with Actor-Critic Framework**: By combining the FBI network with the Actor-Critic algorithm, the paper proposes the FAM method, enabling agents to learn adaptive cooperative policies by considering other agents' policies, and it is suitable for partially observable environments.
3. **Experimental Validation**: The paper validates the effectiveness and feasibility of FAM through experiments on multiple multi-agent particle environments (MPE) and analyzes the information encoded by the FBI network.

In summary, the paper proposes a novel agent modeling method, FAM, aiming to address the key challenges of agent modeling in multi-agent reinforcement learning, particularly the effective interaction and cooperation among agents in unknown and partially observable environments.
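To make the idea of using facts as the reconstruction target more concrete, the following is a minimal sketch of a VAE-style loss of the kind the summary describes. It is not the paper's implementation: the single linear encoder and decoder, the diagonal Gaussian posterior with a standard normal prior, the squared-error reconstruction term, the `beta` weight, the use of the previous reward as an encoder input, and all function and variable names are assumptions made for illustration.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero bias (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def fact_vae_loss(enc, dec, obs, action, prev_reward, reward, next_obs, rng, beta=1.0):
    """Encode local information, then reconstruct the facts (next_obs, reward)."""
    latent_dim = len(enc[1]) // 2
    # Assumption: the encoder sees only the agent's own observation,
    # action, and previous reward; no other agent's data is used.
    stats = linear(enc[0], enc[1], obs + action + [prev_reward])
    mu, log_var = stats[:latent_dim], stats[latent_dim:]
    # Reparameterisation trick: z = mu + sigma * eps, eps ~ N(0, 1).
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]
    # The decoder predicts the facts from z plus the agent's own obs and action.
    pred = linear(dec[0], dec[1], z + obs + action)
    pred_next_obs, pred_reward = pred[:-1], pred[-1]
    recon = sum((p - t) ** 2 for p, t in zip(pred_next_obs, next_obs)) / len(next_obs)
    recon += (pred_reward - reward) ** 2
    return recon + beta * kl_to_standard_normal(mu, log_var), z


# Toy usage with made-up dimensions and a made-up transition.
rng = random.Random(0)
obs_dim, act_dim, latent_dim = 4, 2, 3
enc = init_layer(2 * latent_dim, obs_dim + act_dim + 1, rng)
dec = init_layer(obs_dim + 1, latent_dim + obs_dim + act_dim, rng)
loss, z = fact_vae_loss(
    enc, dec,
    obs=[0.1, -0.2, 0.3, 0.0], action=[1.0, 0.0], prev_reward=0.0,
    reward=0.5, next_obs=[0.2, -0.1, 0.3, 0.1], rng=rng,
)
```

In a setup like this, the latent vector `z` would play the role of the inferred policy representation and could be passed to the actor and critic as an extra input, since the next observation and reward depend on what the other agents did. How FAM actually wires the representation into the Actor-Critic framework is not specified in this summary.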

Fact-based Agent modeling for Multi-Agent Reinforcement Learning

Multi-agent Cooperative Games Using Belief Map Assisted Training

Towards Efficient Collaboration via Graph Modeling in Reinforcement Learning

Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)

Multi-Agent Incentive Communication via Decentralized Teammate Modeling

More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization

Contrastive learning-based agent modeling for deep reinforcement learning

PAC: Assisted Value Factorisation with Counterfactual Predictions in Multi-Agent Reinforcement Learning

Hierarchical relationship modeling in multi-agent reinforcement learning for mixed cooperative–competitive environments

Agent Modeling as Auxiliary Task for Deep Reinforcement Learning

Reaching Consensus in Cooperative Multi-Agent Reinforcement Learning with Goal Imagination

Metric Policy Representations for Opponent Modeling

LMRL: a Multi-Agent Reinforcement Learning Model and Algorithm

Partially Observable Mean Field Multi-Agent Reinforcement Learning Based on Graph-Attention

Improving Sample Efficiency of Multiagent Reinforcement Learning with Nonexpert Policy for Flocking Control

Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning

CVAE-based Far-sighted Intention Inference for Opponent Modeling in Multi-agent Reinforcement Learning

Weighted Mean-Field Multi-Agent Reinforcement Learning via Reward Attribution Decomposition

GAT-MF: Graph Attention Mean Field for Very Large Scale Multi-Agent Reinforcement Learning

Modeling the Interaction Between Agents in Cooperative Multi-Agent Reinforcement Learning

Modeling and reinforcement learning in partially observable many-agent systems