Abstract: Multi-agent multi-target search strategies can be applied in complex scenarios such as post-disaster search and rescue by unmanned aerial vehicles. To overcome the limitation of fixed targets and trajectories, current multi-agent multi-target search strategies are mainly based on deep reinforcement learning (DRL). However, agents trained with DRL tend to be brittle because they are sensitive to the training environment; the learned strategies frequently fall into local optima, resulting in poor system robustness. Additionally, sparse rewards in DRL make convergence difficult and lower the utilization efficiency of the sampled data. To address the weakened robustness of the agents and the sparse rewards in the multi-objective search environment, we propose a MiniMax Multi-agent Deep Deterministic Policy Gradient algorithm based on Parallel Hindsight Experience Replay (PHER-M3DDPG), which adopts the framework of centralized training and decentralized execution in a continuous action space. To enhance system robustness, PHER-M3DDPG employs a minimax learning architecture, which adaptively adjusts the agents' learning strategy by introducing adversarial disturbances. In addition, to solve the sparse-reward problem, PHER-M3DDPG adopts a parallel hindsight experience replay mechanism that increases data-utilization efficiency by introducing virtual learning targets and batch processing of the sampled data. Simulation results show that PHER-M3DDPG outperforms existing algorithms in convergence speed and task completion time in a multi-target search environment.
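The idea of "virtual learning targets" can be illustrated with a minimal sketch of generic hindsight relabeling. This is not the paper's PHER implementation: it assumes the common "final" relabeling strategy, a 2-D position goal, and a sparse reward of 0 on success and -1 otherwise. The names `sparse_reward`, `relabel_with_final_goal`, the dictionary keys, and the tolerance `tol` are all hypothetical and chosen for illustration only. The parallel, batched processing described in the abstract is not reproduced here.

```python
import numpy as np


def sparse_reward(achieved, goal, tol=0.5):
    """Assumed sparse reward: 0.0 if the achieved position is within tol of the goal, else -1.0."""
    distance = np.linalg.norm(np.asarray(achieved, dtype=float) - np.asarray(goal, dtype=float))
    return 0.0 if distance <= tol else -1.0


def relabel_with_final_goal(episode, tol=0.5):
    """Copy an episode, replacing the goal with the position actually reached at the end.

    The final achieved position acts as a virtual target, so at least the last
    transition of the relabeled episode receives a non-failure reward.
    """
    virtual_goal = episode[-1]["achieved"]
    relabeled = []
    for transition in episode:
        relabeled.append({
            "obs": transition["obs"],
            "action": transition["action"],
            "achieved": transition["achieved"],
            "goal": virtual_goal,
            "reward": sparse_reward(transition["achieved"], virtual_goal, tol),
        })
    return relabeled


# A failed episode: the agent never comes near the real goal at (10, 10).
real_goal = [10.0, 10.0]
positions = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
episode = [
    {
        "obs": p,
        "action": [1.0, 0.0],
        "achieved": p,
        "goal": real_goal,
        "reward": sparse_reward(p, real_goal),
    }
    for p in positions
]

relabeled_episode = relabel_with_final_goal(episode)
# Both the original and the relabeled transitions would go into the replay buffer.
replay_buffer = episode + relabeled_episode
```

In this sketch every original transition carries the failure reward, while the relabeled copy ends with a success signal, which is how hindsight replay extracts learning signal from otherwise uninformative episodes.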

Scalable-MADDPG-Based Cooperative Target Invasion for a Multi-USV System

Dynamic Navigation and Area Assignment of Multiple USVs Based on Multi-Agent Deep Reinforcement Learning

Multi-USV System Cooperative Underwater Target Search Based on Reinforcement Learning and Probability Map

Deep Reinforcement Learning Based Multi-UUV Cooperative Control for Target Capturing

Multi-USV Dynamic Navigation and Target Capture: A Guided Multi-Agent Reinforcement Learning Approach

Cooperative multi-agent target searching: a deep reinforcement learning approach based on parallel hindsight experience replay

Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller

Sim-real joint experimental verification for an unmanned surface vehicle formation strategy based on multi-agent deterministic policy gradient and line of sight guidance

Multi-Target Pursuit by a Decentralized Heterogeneous UAV Swarm using Deep Multi-Agent Reinforcement Learning

DRL-based target interception strategy design for an underactuated USV without obstacle collision

Multiple Ships Cooperative Navigation and Collision Avoidance using Multi-agent Reinforcement Learning with Communication

Multiple targets traversing for unmanned surface vehicles by bundled genetic optimization and fast-marching Q-Learning

Safe deep reinforcement learning-based adaptive control for USV interception mission

Multi-USV Formation Collision Avoidance via Deep Reinforcement Learning and COLREGs

Maximizing UAV Coverage in Maritime Wireless Networks: A Multiagent Reinforcement Learning Approach

Control and Coordination of a SWARM of Unmanned Surface Vehicles using Deep Reinforcement Learning in ROS

Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning

Secure and Cooperative Target Tracking Via AUV Swarm - A Reinforcement Learning Approach.

Reinforcement Learning Based Obstacle Avoidance for AUV Swarm in Dynamic Ocean Environment

Collision avoidance decision-making strategy for multiple USVs based on Deep Reinforcement Learning algorithm

Obstacle avoidance USV in multi-static obstacle environments based on a deep reinforcement learning approach