Abstract: In this paper, we explore how to optimize task allocation for robot swarms in dynamic environments, emphasizing the need for robust, flexible, and scalable strategies for robot cooperation. We introduce a novel framework based on a decentralized partially observable Markov decision process (Dec_POMDP), designed specifically for distributed robot swarm networks. At the core of our methodology is the Local Information Aggregation Multi-Agent Deep Deterministic Policy Gradient (LIA_MADDPG) algorithm, which follows the centralized training with distributed execution (CTDE) paradigm. During the centralized training phase, a local information aggregation (LIA) module gathers critical data from neighboring robots, improving decision-making efficiency. In the distributed execution phase, a strategy improvement method dynamically adjusts task allocation in response to changing and partially observable environmental conditions. Our empirical evaluations show that the LIA module can be integrated into various CTDE-based multi-agent reinforcement learning (MARL) methods and significantly enhances their performance. Additionally, by comparing LIA_MADDPG with six conventional reinforcement learning algorithms and a heuristic algorithm, we demonstrate its superior scalability, its rapid adaptation to environmental changes, and its ability to maintain both stability and convergence speed. These results underscore the potential of LIA_MADDPG to significantly improve dynamic task allocation in robot swarms through enhanced local collaboration and adaptive strategy execution.

A dynamic mission abort policy for the swarm executing missions and its solution method by tailored deep reinforcement learning

Hierarchical Decision and Control for Continuous Multitarget Problem: Policy Evaluation with Action Delay

Multi-Agent Reinforcement Learning Based UAV Swarm Communications Against Jamming

Model-free Maneuvering Control of Fixed-Wing UAVs Based on Deep Reinforcement Learning

Spacecraft Attitude Maneuver Planning Based on Deep Reinforcement Learning under Complex Constraints

Failure risk management: adaptive performance control and mission abort decisions

Mission risk control via joint optimization of sampling and abort decisions

Mission Aborting Policies and Multiattempt Missions

Task Assignment for UAV Swarm Saturation Attack: A Deep Reinforcement Learning Approach

An Effective and Scalable Approach for Swarm-on-Swarm Air Combat Decision

UAV Swarm Confrontation Using Hierarchical Multiagent Reinforcement Learning

A Local Information Aggregation based Multi-Agent Reinforcement Learning for Robot Swarm Dynamic Task Allocation

Activation delay and aborting policy minimizing expected losses in consecutive attempts having cumulative effect on mission success

MW-MADDPG: a meta-learning based decision-making method for collaborative UAV swarm

Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning

Joint Communication and Action Learning in Multi-Target Tracking of UAV Swarms with Deep Reinforcement Learning

Deep Reinforcement Learning for UAV Intelligent Mission Planning

Responsive Regulation of Dynamic UAV Communication Networks Based on Deep Reinforcement Learning

Risk Control of Mission-Critical Systems: Abort Decision-Makings Integrating Health and Age Conditions

Hierarchical Reinforcement Learning for Swarm Confrontation with High Uncertainty

Optimal Mission Abort Policy for Systems Operating in a Random Environment