Abstract:Abstract Multi-agent multi-target search strategies can be utilized in complex scenarios such as post-disaster search and rescue by unmanned aerial vehicles. To solve the problem of fixed target and trajectory, the current multi-agent multi-target search strategies are mainly based on deep reinforcement learning (DRL). However, the training of agents by the DRL tend to be brittle due to their sensitivity to the training environment, which makes the strategies learned by the agents fall into local optima frequently, resulting in poor system robustness. Additionally, sparse rewards in DRL will lead to the problems such as difficulty in system convergence and low utilization efficiency of the sampled data. To address the problem that the robustness of the agents is weakened and the sparse rewards exist in the multi-objective search environment, we propose a MiniMax Multi-agent Deep Deterministic Policy Gradient based on the Parallel Hindsight Experience Replay (PHER-M3DDPG) algorithm, which adopts the framework of centralized training and decentralized execution in continuous action space. To enhance the system robustness, the PHER-M3DDPG algorithm employs a minimax learning architecture, which adaptively adjusts the learning strategy of agents by involving adversarial disturbances. In addition, to solve the sparse rewards problem, the PHER-M3DDPG algorithm adopts a parallel hindsight experience replay mechanism to increase the efficiency of data utilization by involving virtual learning targets and batch processing of the sampled data. Simulation results show that the PHER-M3DDPG algorithm outperforms the existing algorithms in terms of convergence speed and the task completion time in a multi-target search environment.

Hierarchical Heterogeneous Multi-Agent Cross-Domain Search Method Based on Deep Reinforcement Learning

Learning to Cooperate: Application of Deep Reinforcement Learning for Online AGV Path Finding.

Underwater Multi-agent Cooperative Formation Hunting Based on Deep Reinforcement Learning

Cooperative multi-agent target searching: a deep reinforcement learning approach based on parallel hindsight experience replay

Moving Forward in Formation: A Decentralized Hierarchical Learning Approach to Multi-Agent Moving Together

A Multi-AUV Maritime Target Search Method for Moving and Invisible Objects Based on Multi-Agent Deep Reinforcement Learning

UAV Swarm Cooperative Target Search: A Multi-Agent Reinforcement Learning Approach

Multi-AUV Cooperative Underwater Multi-Target Tracking Based on Dynamic-Switching-enabled Multi-Agent Reinforcement Learning

Hierarchical Reinforcement Learning Framework towards Multi-agent Navigation

HA-MARL: Heuristic and APF Assisted Multi-Agent Reinforcement Learning for Wireless Data Sharing in AUV Swarms

Target Search and Navigation in Heterogeneous Robot Systems with Deep Reinforcement Learning

Multi-Agent Reinforcement Learning Based Secure Searching and Data Collection in AUV Swarms.

Differential Game-Based Deep Reinforcement Learning in Underwater Target Hunting Task

Hierarchical Multi-Agent Reinforcement Learning for Air Combat Maneuvering

Multi-Agent Deep Reinforcement Learning Framework Strategized by Unmanned Aerial Vehicles for Multi-Vessel Full Communication Connection

Maximizing UAV Coverage in Maritime Wireless Networks: A Multiagent Reinforcement Learning Approach

Dynamic Target Assignment by Unmanned Surface Vehicles Based on Reinforcement Learning

Multi-USV Dynamic Navigation and Target Capture: A Guided Multi-Agent Reinforcement Learning Approach

HELSA: Hierarchical Reinforcement Learning with Spatiotemporal Abstraction for Large-Scale Multi-Agent Path Finding

Multi-AUV Assisted Seamless Underwater Target Tracking Relying on Deep Learning and Reinforcement Learning

Autonomous and Adaptive Role Selection for Multi-robot Collaborative Area Search Based on Deep Reinforcement Learning