Abstract: Understanding the intentions and beliefs of others, a phenomenon known as "theory of mind", is a crucial element of social behavior. These beliefs and perceptions are inherently subjective and latent, so they often cannot be observed directly. Social interactions further complicate the matter, as multiple agents can engage in recursive reasoning about each other's strategies at increasing levels of a cognitive hierarchy. While previous research has shown promise in inferring a single agent's beliefs about values through inverse reinforcement learning, extending this to model interactions among multiple agents remains an open challenge because of the computational complexity. In this work, we adopted probabilistic recursive modeling of cognitive levels and joint value decomposition to achieve efficient multi-agent inverse reinforcement learning (MAIRL). We validated our method using simulations of a cooperative foraging task. Our algorithm recovered both the ground-truth goal-directed value function and agents' beliefs about their counterparts' strategies. When applied to human behavior in a cooperative hallway task, our method identified meaningful goal maps that evolved with task proficiency and an interaction map related to key states in the task, without access to the task rules. Similarly, in a non-cooperative task performed by monkeys, we identified mutual predictions that correlated with the animals' social hierarchy, highlighting the behavioral relevance of the latent beliefs we uncovered. Together, our findings demonstrate that MAIRL offers a new framework for uncovering human or animal beliefs in social behavior, thereby illuminating previously opaque aspects of social cognition.
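The recursive reasoning over cognitive levels mentioned in the abstract can be illustrated with a toy example. The sketch below is not the MAIRL algorithm and is not taken from the paper; it is a generic level-k model on a two-player matrix game, written under stated assumptions. The function names (`softmax`, `level_k_policy`, `mixed_level_policy`), the inverse-temperature parameter `beta`, the uniform level-0 policy, and the example payoff matrix are all hypothetical choices made for illustration.

```python
import math


def softmax(values, beta=1.0):
    """Turn action values into choice probabilities (higher beta = greedier)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def level_k_policy(payoff_self, payoff_other, k, beta=1.0):
    """Policy of a level-k agent in a two-player matrix game (illustrative only).

    payoff_self[a][b]: payoff to this agent when it plays a and the partner plays b.
    payoff_other[b][a]: payoff to the partner, indexed from the partner's own view.
    Assumption: level 0 acts uniformly at random, and a level-k agent
    responds softly to a partner it believes reasons at level k-1.
    """
    n_actions = len(payoff_self)
    if k == 0:
        return [1.0 / n_actions] * n_actions
    # Belief about the partner: it reasons one level below this agent.
    partner = level_k_policy(payoff_other, payoff_self, k - 1, beta)
    # Expected payoff of each own action under that belief.
    expected = [
        sum(p * payoff_self[a][b] for b, p in enumerate(partner))
        for a in range(n_actions)
    ]
    return softmax(expected, beta)


def mixed_level_policy(payoff_self, payoff_other, level_probs, beta=1.0):
    """Average the level-k policies under a probability distribution over levels."""
    policy = [0.0] * len(payoff_self)
    for k, weight in enumerate(level_probs):
        pk = level_k_policy(payoff_self, payoff_other, k, beta)
        policy = [p + weight * q for p, q in zip(policy, pk)]
    return policy


if __name__ == "__main__":
    # Hypothetical symmetric coordination game: both agents prefer to match on action 0.
    game = [[2.0, 0.0], [0.0, 1.0]]
    for level in range(3):
        print(level, level_k_policy(game, game, level, beta=2.0))
    print("mixture", mixed_level_policy(game, game, [0.2, 0.5, 0.3], beta=2.0))
```

In this toy game, deeper levels place more probability on the mutually preferred action, because each level expects its partner to favor it as well. An inverse approach would run in the opposite direction, inferring payoffs and level weights from observed choices, which this sketch does not attempt.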

Competitive Multi-agent Deep Reinforcement Learning with Counterfactual Thinking

Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)

Toward a Psychology of Deep Reinforcement Learning Agents Using a Cognitive Architecture

Counter-Factual Reinforcement Learning: How to Model Decision-Makers That Anticipate The Future

From mimic to counteract: a two-stage reinforcement learning algorithm for Google research football

Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning

Theory of Mind as Intrinsic Motivation for Multi-Agent Reinforcement Learning

Episodic Future Thinking Mechanism for Multi-agent Reinforcement Learning

Decentralized Counterfactual Value with Threat Detection for Multi-Agent Reinforcement Learning in Mixed Cooperative and Competitive Environments

Algorithms in Multi-Agent Systems: A Holistic Perspective from Reinforcement Learning and Game Theory

Modeling Theory of Mind in Multi-Agent Games Using Adaptive Feedback Control

Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning

Improving Agent Decision Payoffs via a New Framework of Opponent Modeling

Counterfactual State Explanations for Reinforcement Learning Agents via Generative Deep Learning

Multiagent Inverse Reinforcement Learning via Theory of Mind Reasoning

Situation-Dependent Causal Influence-Based Cooperative Multi-agent Reinforcement Learning

Unveiling the latent dynamics in social cognition with multi-agent inverse reinforcement learning

Hierarchical Deep Reinforcement Learning Agent with Counter Self-play on Competitive Games

RLCFR: Minimize counterfactual regret by deep reinforcement learning

Fast Adaptation to External Agents Via Meta Imitation Counterfactual Regret Advantage

Deep multiagent reinforcement learning: challenges and directions