Abstract: One of the challenges for multiagent reinforcement learning (MARL) is designing efficient learning algorithms for a large system in which each agent has only limited or partial information about the entire system. Whereas exciting progress has been made in analyzing decentralized MARL with a network of agents, as in social networks and team video games, little is known theoretically about decentralized MARL with a network of states, which models self-driving vehicles, ride-sharing, and data and traffic routing. This paper proposes a framework of localized training and decentralized execution to study MARL with a network of states. Localized training means that agents only need to collect local information in their neighboring states during the training phase; decentralized execution means that agents can afterward execute the learned decentralized policies, which depend only on the agents’ current states. The theoretical analysis consists of three key components: the first is the reformulation of the MARL system as a networked Markov decision process with teams of agents, which enables the associated team Q-function to be updated in a localized fashion; the second is the Bellman equation for the value function and the appropriate Q-function on the probability measure space; and the third is the exponential decay property of the team Q-function, which facilitates its approximation with high sample efficiency and controllable error. The theoretical analysis paves the way for a new algorithm, LTDE-Neural-AC, which adopts the actor–critic approach with overparameterized neural networks. Its convergence and sample complexity are established and shown to be scalable with respect to the sizes of both agents and states. To the best of our knowledge, this is the first neural network–based MARL algorithm with network structure and a provable convergence guarantee.

Funding: X. Wei is partially supported by NSFC no. 12201343. R. Xu is partially supported by the NSF CAREER award DMS-2339240.

S2RL: Do We Really Need to Perceive All States in Deep Multi-Agent Reinforcement Learning?


Multiagent Reinforcement Learning for Strictly Constrained Tasks Based on Reward Recorder

Attentive Relational State Representation in Decentralized Multiagent Reinforcement Learning

Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?

MARL-LNS: Cooperative Multi-agent Reinforcement Learning via Large Neighborhoods Search

Boosting Value Decomposition Via Unit-Wise Attentive State Representation for Cooperative Multi-Agent Reinforcement Learning

SC-MAIRL: Semi-Centralized Multi-Agent Imitation Reinforcement Learning

LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning

Decentralized multi-agent reinforcement learning based on best-response policies

Decentralized Multi-Agent Reinforcement Learning: An Off-Policy Method

Attention Based Large Scale Multi-agent Reinforcement Learning

YOLO-MARL: You Only LLM Once for Multi-agent Reinforcement Learning

Hybrid Information-driven Multi-agent Reinforcement Learning

Attention-Guided Contrastive Role Representations for Multi-Agent Reinforcement Learning

Mean-Field Multiagent Reinforcement Learning: A Decentralized Network Approach

Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning


From Centralized to Self-Supervised: Pursuing Realistic Multi-Agent Reinforcement Learning

Attention-Driven Multi-Agent Reinforcement Learning: Enhancing Decisions with Expertise-Informed Tasks