Innate-Values-driven Reinforcement Learning for Cooperative Multi-Agent Systems

Qin Yang
2024-01-11
Abstract: Innate values describe agents' intrinsic motivations, which reflect their inherent interests and preferences to pursue goals and drive them to develop diverse skills satisfying their various needs. The essence of reinforcement learning (RL) is learning from interaction based on reward-driven (such as utilities) behaviors, much like natural agents. It is an excellent model to describe the innate-values-driven (IV) behaviors of AI agents. Especially in multi-agent systems (MAS), building the awareness of AI agents to balance the group utilities and system costs and satisfy group members' needs in their cooperation is a crucial problem for individuals learning to support their community and integrate into human society in the long term. This paper proposes a hierarchical compound intrinsic value reinforcement learning model -- innate-values-driven reinforcement learning, termed IVRL -- to describe the complex behaviors of multi-agent interaction in their cooperation. We implement the IVRL architecture in the StarCraft Multi-Agent Challenge (SMAC) environment and compare the cooperative performance of three characteristics of innate value agents (Coward, Neutral, and Reckless) through three benchmark multi-agent RL algorithms: QMIX, IQL, and QTRAN. The results demonstrate that by organizing individuals' various needs rationally, the group can achieve better performance with lower costs effectively.
Machine Learning, Artificial Intelligence, Multiagent Systems, Robotics
What problem does this paper attempt to address?
This paper proposes Innate-Values-driven Reinforcement Learning (IVRL), a model for describing complex agent behavior in cooperative multi-agent systems (MAS). In natural systems, innate values describe an organism's inherent motivations, which drive it to pursue goals and develop diverse skills to meet its needs. Reinforcement learning (RL) learns from reward-driven interaction, so it is a natural fit for modeling this kind of innate-values-driven behavior. In a MAS, agents additionally need to be aware of group utilities, system costs, and the needs of other group members, and to balance these when cooperating, if they are to support their community and integrate into human society in the long term. IVRL approaches this with a hierarchical compound intrinsic value reinforcement learning model, and the paper also discusses how incentive theory can be combined with innate values to describe complex behaviors in multi-agent interaction.
The paper implements the IVRL architecture in the StarCraft Multi-Agent Challenge (SMAC) environment and compares the cooperative performance of agents with three innate value characteristics (Coward, Neutral, and Reckless) under three benchmark multi-agent RL algorithms: QMIX, IQL, and QTRAN. The reported results indicate that when individual needs are organized rationally, the group achieves better performance at lower cost, which supports the effectiveness of the IVRL model. In summary, the paper asks how innate-values-driven reinforcement learning can improve cooperation in multi-agent systems by letting each agent develop strategies, adapted to a complex environment, from its own inherent needs and preferences.
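As a rough illustration of the idea that different innate values lead agents to score the same outcome differently, the sketch below weights two needs (dealing damage and staying safe) per agent characteristic. It is not the paper's formulation: the `InnateValues` class, the two needs, the numeric weights, and the `shaped_reward` function are all hypothetical and chosen only to make the Coward, Neutral, and Reckless labels concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InnateValues:
    """Hypothetical per-agent weights over two needs (not from the paper)."""
    attack: float  # weight on damage dealt to enemies
    safety: float  # weight on avoiding damage taken


# Illustrative weights only; the paper's actual values are not given here.
PROFILES = {
    "Coward": InnateValues(attack=0.2, safety=0.8),
    "Neutral": InnateValues(attack=0.5, safety=0.5),
    "Reckless": InnateValues(attack=0.8, safety=0.2),
}


def shaped_reward(profile: InnateValues, damage_dealt: float, damage_taken: float) -> float:
    """Utility of one step: weighted gain from attacking minus weighted cost of being hit."""
    return profile.attack * damage_dealt - profile.safety * damage_taken


if __name__ == "__main__":
    # The same outcome (10 damage dealt, 4 taken) is valued differently by each profile.
    for name, profile in PROFILES.items():
        print(name, round(shaped_reward(profile, 10.0, 4.0), 2))
```

In a setup like this, each agent's shaped reward would replace or augment the environment reward before it is passed to a learner such as QMIX, IQL, or QTRAN; how IVRL actually composes its hierarchical needs is specified in the paper itself.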