Abstract: Among the research topics in multi-agent learning, mixed-motive cooperation is one of the most prominent challenges, primarily due to the mismatch between individual and collective goals. Cutting-edge research focuses on incorporating domain knowledge into rewards and introducing additional mechanisms to incentivize cooperation. However, these approaches often suffer from shortcomings such as the effort required for manual design and the absence of theoretical grounding. To close this gap, we model the mixed-motive game as a differentiable game, which makes it easier to illuminate the learning dynamics towards cooperation. More specifically, we introduce a novel optimization method named Altruistic Gradient Adjustment (AgA) that employs gradient adjustments to progressively align individual and collective objectives. Furthermore, we theoretically prove that AgA effectively attracts gradients to stable fixed points of the collective objective while considering individual interests, and we validate these claims with empirical evidence. We evaluate the effectiveness of AgA on benchmark environments for mixed-motive collaboration with small numbers of agents, such as the two-player public good game and the sequential social dilemma games Cleanup and Harvest, as well as on our self-developed large-scale environment in the game StarCraft II.

What problem does this paper attempt to address?

This paper examines mixed-motive cooperation in multi-agent learning. Specifically, it addresses the following problems:

1. **Inconsistency between individual and collective goals**:
   - In a multi-agent system, each agent usually has its own goal, which may diverge from the collective goal or even conflict with it. This mismatch is one of the main challenges in mixed-motive cooperation.
2. **Limitations of existing methods**:
   - Most current research relies on manually designed mechanisms to promote cooperation, such as reputation, norms, and contracts, or on intrinsic motivation that adjusts individual and collective goals. These methods often require substantial manual design and lack a theoretical basis.
   - In addition, many existing algorithms struggle with Nash equilibria in non-convex games and have difficulty finding stable points.
3. **Combination of theory and practice**:
   - The paper models mixed-motive games as differentiable games and proposes a new optimization method, Altruistic Gradient Adjustment (AgA), which aligns individual and collective goals at the level of gradients.
   - It proves theoretically that AgA attracts gradients towards the stable fixed points of the collective objective while still considering individual interests.
   - Empirical results support the effectiveness of AgA and show strong performance across different environments.

### Main contributions

- **Modeling mixed-motive games as differentiable games for the first time**: A new framework, DMG (Differentiable Mixed-Motive Game), is proposed for analyzing the learning dynamics at the individual and collective levels.
- **Proposing the AgA algorithm**: AgA aligns individual and collective goals by modifying the gradient, and its effectiveness near stable fixed points is proven theoretically.
- **Introducing the Selfish-MMM2 environment**: This new large-scale mixed-motive cooperation environment supports larger scenarios and more complex tasks, and is used to verify the performance of AgA.

Through these contributions, the paper provides both a theoretical framework and a practical method for mixed-motive cooperation in multi-agent systems.
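The mismatch between individual and collective gradients that motivates AgA can be seen in a toy two-player public goods game. The sketch below is only an illustration under stated assumptions: the payoff form, the multiplication factor `M`, the learning rate, and all function names are hypothetical and are not taken from the paper. It does not implement the AgA update itself, whose exact form is not given in this summary. It only shows that plain gradient ascent on each agent's own reward and gradient ascent on the summed reward move the contributions in opposite directions.

```python
# Toy two-player public goods game (assumed payoff form, not from the paper).
# Each player holds an endowment of 1 and contributes a[i] in [0, 1].
M = 1.5  # assumed multiplication factor; 1 < M < 2 creates the social dilemma


def payoff(i, a):
    """Reward of player i: kept endowment plus an equal share of the multiplied pot."""
    return 1.0 - a[i] + M * sum(a) / len(a)


def welfare(a):
    """Collective objective: sum of all players' rewards."""
    return sum(payoff(i, a) for i in range(len(a)))


def individual_grad(a):
    """d payoff_i / d a[i] for each player (each agent differentiates its own reward)."""
    return [-1.0 + M / len(a) for _ in a]


def collective_grad(a):
    """d welfare / d a[i] for each player."""
    return [M - 1.0 for _ in a]


def ascend(grad_fn, a, lr=0.1, steps=50):
    """Projected gradient ascent that keeps contributions inside [0, 1]."""
    a = list(a)
    for _ in range(steps):
        g = grad_fn(a)
        a = [min(1.0, max(0.0, x + lr * gi)) for x, gi in zip(a, g)]
    return a


start = [0.5, 0.5]
selfish = ascend(individual_grad, start)  # contributions shrink to zero
social = ascend(collective_grad, start)   # contributions grow to the full endowment

# Negative inner product: the two gradient fields point in opposing directions.
alignment = sum(gi * gc for gi, gc in zip(individual_grad(start), collective_grad(start)))

print("selfish:", selfish, "welfare:", welfare(selfish))
print("social: ", social, "welfare:", welfare(social))
print("alignment:", alignment)
```

In this toy setting the selfish learners end with lower total reward than the collectively optimizing ones, which is the gap a gradient-alignment method such as AgA is designed to close.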

Aligning Individual and Collective Objectives in Multi-Agent Cooperation
