Abstract: Innate values describe agents' intrinsic motivations, which reflect their inherent interests and preferences for pursuing goals and drive them to develop diverse skills that satisfy their various needs. The essence of reinforcement learning (RL) is learning from interaction based on reward-driven behaviors (such as utilities), much like natural agents, which makes it an excellent model for describing the innate-values-driven (IV) behaviors of AI agents. In multi-agent systems (MAS) especially, building AI agents' awareness of how to balance group utilities and system costs while satisfying group members' needs during cooperation is a crucial problem for individuals learning to support their community and integrate into human society in the long term. This paper proposes a hierarchical compound intrinsic value reinforcement learning model, innate-values-driven reinforcement learning (IVRL), to describe the complex behaviors of multi-agent interaction in cooperation. We implement the IVRL architecture in the StarCraft Multi-Agent Challenge (SMAC) environment and compare the cooperative performance of three innate-value agent characteristics (Coward, Neutral, and Reckless) across three benchmark multi-agent RL algorithms: QMIX, IQL, and QTRAN. The results demonstrate that by organizing individuals' various needs rationally, the group can achieve better performance at lower cost.

What problem does this paper attempt to address?

This paper proposes Innate-Values-driven Reinforcement Learning (IVRL), a model for describing complex agent behavior in cooperative multi-agent systems. In natural systems, innate values describe an organism's inherent motivation, driving it to pursue goals and develop diverse skills that meet different needs. Reinforcement learning (RL) learns from reward-based interaction, so it is a natural fit for modeling this innate-values-driven behavior. In multi-agent systems (MAS), agents must be aware of group utility, system costs, and the needs of individual members, and must balance these factors while cooperating in order to support their community and integrate into human society in the long term. IVRL addresses this with a hierarchical compound intrinsic value reinforcement learning model. The paper implements the IVRL architecture in the StarCraft Multi-Agent Challenge (SMAC) environment and compares the cooperative performance of three innate-value agent characteristics (Coward, Neutral, and Reckless) under three multi-agent RL algorithms (QMIX, IQL, and QTRAN). The results indicate that when individual needs are organized rationally, the group achieves better performance at lower cost, which supports the effectiveness of the IVRL model. The paper also discusses how to combine incentive theory with innate values to describe complex behaviors in multi-agent interactions. In summary, the paper asks how innate-values-driven reinforcement learning can improve the cooperative performance of multi-agent systems by letting individuals develop strategies that adapt to complex environments based on their inherent needs and preferences.
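The abstract does not give the paper's actual utility definitions, so the following is only a minimal sketch of the general idea: an agent's reward is a weighted combination of need utilities, and the weights encode its innate-value characteristic. The `InnateValues` class, the two needs (`safety`, `achievement`), the weight values, and the `innate_value_reward` function are all hypothetical and chosen for illustration; they are not taken from the paper or from SMAC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InnateValues:
    """Hypothetical weights an agent places on each need (illustrative only)."""
    safety: float       # weight on the cost of taking damage
    achievement: float  # weight on the utility of dealing damage


# Assumed profiles for the three characteristics named in the abstract.
# The numbers are placeholders, not values reported by the paper.
PROFILES = {
    "Coward": InnateValues(safety=0.8, achievement=0.2),
    "Neutral": InnateValues(safety=0.5, achievement=0.5),
    "Reckless": InnateValues(safety=0.2, achievement=0.8),
}


def innate_value_reward(profile: InnateValues,
                        damage_dealt: float,
                        damage_taken: float) -> float:
    """Weighted sum of need utilities: reward utility gained, penalize cost paid."""
    return profile.achievement * damage_dealt - profile.safety * damage_taken


if __name__ == "__main__":
    # The same event (deal 10 damage, take 10 damage) is valued differently
    # depending on the agent's innate-value profile.
    for name, profile in PROFILES.items():
        print(name, innate_value_reward(profile, damage_dealt=10.0, damage_taken=10.0))
```

Under these assumed weights, the same exchange of damage is a net loss for a Coward agent, neutral for a Neutral agent, and a net gain for a Reckless agent, which is one simple way an innate-value profile could shape the per-agent reward fed to a learner such as QMIX, IQL, or QTRAN.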

Innate-Values-driven Reinforcement Learning for Cooperative Multi-Agent Systems

Rationality based Innate-Values-driven Reinforcement Learning

S2RL: Do We Really Need to Perceive All States in Deep Multi-Agent Reinforcement Learning?

LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning

Learning to Incentivize Other Learning Agents

Modeling the Interaction Between Agents in Cooperative Multi-Agent Reinforcement Learning

Peer Incentive Reinforcement Learning for Cooperative Multiagent Games

Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning

Modeling Moral Choices in Social Dilemmas with Multi-Agent Reinforcement Learning

Intrinsic Reward with Peer Incentives for Cooperative Multi-Agent Reinforcement Learning

Commander-Soldiers Reinforcement Learning for Cooperative Multi-Agent Systems

A new multi-agent reinforcement learning approach

Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning

Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery

Multi-Agent Evolutionary Reinforcement Learning Based on Cooperative Games

Priority over Quantity: A Self-Incentive Credit Assignment Scheme for Cooperative Multiagent Reinforcement Learning

Situation-Dependent Causal Influence-Based Cooperative Multi-agent Reinforcement Learning

DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning

Individual Reward Assisted Multi-Agent Reinforcement Learning