Abstract: Multi-agent systems (MAS) are widely prevalent and crucially important in numerous real-world applications, where multiple agents must make decisions to achieve their objectives in a shared environment. Despite their ubiquity, the development of intelligent decision-making agents in MAS poses several open challenges to their effective implementation. This survey examines these challenges, placing an emphasis on studying seminal concepts from game theory (GT) and machine learning (ML) and connecting them to recent advancements in multi-agent reinforcement learning (MARL), i.e., the study of data-driven decision-making within MAS. The objective of this survey is to provide a comprehensive perspective along the various dimensions of MARL, shedding light on the unique opportunities presented in MARL applications while highlighting the inherent challenges that accompany this potential. We hope that our work will not only contribute to the field by analyzing the current landscape of MARL but also motivate future directions with insights for deeper integration of concepts from the related domains of GT and ML. With this in mind, this work explores recent and past efforts in MARL and its related fields in detail and describes previously proposed solutions, their limitations, and their applications.
What problem does this paper attempt to address?
This paper attempts to address several open challenges in the development of intelligent decision-making agents in multi-agent systems (MAS). Specifically, these challenges concern data-driven decision-making, the subject of multi-agent reinforcement learning (MARL). By studying core concepts in game theory (GT) and machine learning (ML) and connecting them with recent progress in MARL, the paper aims to provide a comprehensive perspective on MARL along its various dimensions. This includes revealing the unique opportunities in MARL applications while highlighting the inherent challenges that accompany this potential. The survey therefore not only analyzes the current state of the MARL field but also aims to motivate future research directions through deeper integration of concepts from related fields.
### Main Objectives
- **Provide a Comprehensive Perspective**: The paper aims to provide a comprehensive understanding of MARL from multiple angles, covering its unique opportunities and challenges.
- **Promote the Development of the Field**: The paper analyzes current research achievements, their limitations, and practical applications in order to promote further development of the MARL field.
- **Inspire Future Research**: The paper proposes future research directions, especially ones that combine concepts from game theory and machine learning, and offers insights for deeper integration.
### Core Questions
- **Decision-making Challenges in Multi-agent Environments**: How can effective decision-making be achieved in multi-agent environments?
- **Applications of Game Theory and Machine Learning**: How can the core concepts of game theory and machine learning be used to solve problems in MARL?
- **Complexity of the Real World**: How can the complexity of real-world MAS applications be incorporated into MARL solutions?
### Structural Overview
- **Section 3**: Define the optimal control learning problem in MAS and discuss the core concepts in the basic fields of MARL, such as game theory and machine learning.
- **Section 4**: Explore the unique advantages and challenges of learning in MAS and analyze the learning pathologies in the MARL paradigm.
- **Section 5**: Study the prospects of MARL, including specific simulations, training paradigms, communication methods, multi-agent credit assignment, ad hoc teamwork, social learning, and agent modeling, and discuss the relevant recent efforts in detail.
### Formula Examples
- **State Transition Function**:
\[
T(s, a, s') = P(s_{t + 1}=s'\mid s_t = s, a_t = a)\quad(1)
\]
- **Return (Discounted Cumulative Reward)**:
\[
G(i, t)(\tau)=\sum_{t' = t}^{T}\gamma^{t'}r_i(s_{t'}, a_{i, t'})\quad(4)
\]
- **Value Function**:
\[
V^{\pi}_i(s_t\mid\pi)=\mathbb{E}_{\tau\sim p_{\pi}(\tau\mid s_t)}[G(i, t)(\tau)]\quad(5)
\]
- **Q-value Function**:
\[
Q^{\pi}_i(s_t, a_{i, t}\mid\pi)=\mathbb{E}_{\tau\sim p_{\pi}(\tau\mid s_t, a_{i, t})}[G(i, t)(\tau)]\quad(6)
\]
- **Advantage Function**:
\[
A^{\pi}_i(s_t, a_{i, t}\mid\pi)=Q^{\pi}_i(s_t, a_{i, t}\mid\pi)-V^{\pi}_i(s_t\mid\pi)\quad(8)
\]
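The quantities in Eqs. (4), (5), (6), and (8) can be illustrated with a small Monte Carlo sketch. This is not code from the paper: the function names, the sample trajectories, and the choice of `gamma` are all hypothetical. The expectations are approximated by plain sample means, which assumes the listed trajectories were drawn under the policy being evaluated. The discount exponent follows Eq. (4) as written, using the absolute step index.

```python
from statistics import mean


def discounted_return(rewards, t, gamma):
    """Return of Eq. (4): sum over t' = t..T of gamma**t' * r_i at step t'.

    `rewards` lists agent i's reward at each step of one trajectory.
    The exponent is the absolute step t', as the equation is written above;
    many texts use gamma**(t' - t) instead.
    """
    return sum(gamma ** k * rewards[k] for k in range(t, len(rewards)))


def mc_estimate(reward_sequences, t, gamma):
    """Monte Carlo estimate of an expected return: the mean over samples."""
    return mean(discounted_return(r, t, gamma) for r in reward_sequences)


# Hypothetical samples, all starting from the same state s_0.
# Each entry: (agent i's action at t = 0, agent i's rewards along the trajectory).
samples = [
    ("left", [1.0, 0.0, 4.0]),
    ("left", [0.0, 2.0, 0.0]),
    ("right", [0.0, 0.0, 4.0]),
    ("right", [0.0, 0.0, 0.0]),
]
gamma, t = 0.5, 0  # assumed values, chosen only for easy arithmetic

# V: average return over all trajectories from s_0 (Eq. 5).
v = mc_estimate([r for _, r in samples], t, gamma)

# Q: average return over trajectories that also fix agent i's action (Eq. 6).
q = {
    a: mc_estimate([r for act, r in samples if act == a], t, gamma)
    for a in ("left", "right")
}

# Advantage: how much an action improves on the state's average return (Eq. 8).
adv = {a: q[a] - v for a in q}

print(v, q, adv)
```

With these samples the estimated state value is 1.0, the Q-values are 1.5 for `left` and 0.5 for `right`, and the advantages are 0.5 and -0.5. A positive advantage marks an action whose estimated return exceeds the state's average.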
Through these formulas and concepts, the paper aims to give readers a comprehensive, in-depth understanding that supports more effective research and applications in multi-agent reinforcement learning.