Abstract: Multi-agent multi-target search strategies can be applied in complex scenarios such as post-disaster search and rescue by unmanned aerial vehicles. To overcome the limitation of fixed targets and trajectories, current multi-agent multi-target search strategies are mainly based on deep reinforcement learning (DRL). However, agents trained with DRL tend to be brittle because they are sensitive to the training environment, so the strategies they learn frequently fall into local optima, resulting in poor system robustness. Additionally, sparse rewards in DRL lead to problems such as difficult convergence and inefficient use of the sampled data. To address the weakened robustness of the agents and the sparse rewards in the multi-target search environment, we propose a MiniMax Multi-agent Deep Deterministic Policy Gradient algorithm based on Parallel Hindsight Experience Replay (PHER-M3DDPG), which adopts the framework of centralized training and decentralized execution in a continuous action space. To enhance system robustness, the PHER-M3DDPG algorithm employs a minimax learning architecture, which adaptively adjusts the learning strategy of the agents by introducing adversarial disturbances. In addition, to solve the sparse-reward problem, the PHER-M3DDPG algorithm adopts a parallel hindsight experience replay mechanism, which increases the efficiency of data utilization by introducing virtual learning targets and processing the sampled data in batches. Simulation results show that the PHER-M3DDPG algorithm outperforms existing algorithms in terms of convergence speed and task completion time in a multi-target search environment.
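
The idea of virtual learning targets can be illustrated with a minimal sketch of generic hindsight experience replay. This is not the PHER-M3DDPG implementation: the "future" goal-sampling strategy, the 2-D point states, the tolerance-based sparse reward, and all names (`sparse_reward`, `relabel_with_hindsight`, `k`, `tol`) are assumptions made for illustration, and the parallel batching and minimax components are not shown.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]
# One stored step: (state, action, next_state, goal, reward)
Transition = Tuple[Point, Point, Point, Point, float]


def sparse_reward(position: Point, goal: Point, tol: float = 0.1) -> float:
    """Assumed sparse reward: 0 within `tol` of the goal, otherwise -1."""
    dx, dy = position[0] - goal[0], position[1] - goal[1]
    return 0.0 if (dx * dx + dy * dy) ** 0.5 <= tol else -1.0


def relabel_with_hindsight(
    episode: List[Transition],
    k: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Transition]:
    """Return the original transitions plus k relabeled copies of each.

    Each copy replaces the real goal with a virtual goal: a position the
    agent actually reached at the same step or later in the episode.
    """
    rng = rng or random.Random(0)
    augmented: List[Transition] = []
    for t, (state, action, next_state, goal, reward) in enumerate(episode):
        augmented.append((state, action, next_state, goal, reward))
        future = episode[t:]  # steps from t to the end of the episode
        for _ in range(k):
            virtual_goal = rng.choice(future)[2]  # an achieved next_state
            new_reward = sparse_reward(next_state, virtual_goal)
            augmented.append((state, action, next_state, virtual_goal, new_reward))
    return augmented
```

Under these assumptions, an episode that never reaches its real target still yields some transitions with a non-negative reward after relabeling, which is how hindsight replay extracts a learning signal from sparse-reward data.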

Learning Efficient Multi-Agent Cooperative Visual Exploration

Learning to Cooperate: Application of Deep Reinforcement Learning for Online AGV Path Finding

MASP: Scalable GNN-based Planning for Multi-Agent Navigation

SAVE: Spatial-Attention Visual Exploration

Active Neural Topological Mapping for Multi-Agent Exploration

Attention-Cooperated Reinforcement Learning for Multi-agent Path Planning

Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)

Cooperative multi-agent target searching: a deep reinforcement learning approach based on parallel hindsight experience replay

Self-Motivated Multi-Agent Exploration

Multi-robot Social-aware Cooperative Planning in Pedestrian Environments Using Multi-agent Reinforcement Learning

Edge-conditioned vector basis functions for the analysis and optimization of rectangular waveguide dual-mode filters

Collaborative Visual Navigation

Learning to Act with Affordance-Aware Multimodal Neural SLAM

Multi-Robot Cooperative Socially-Aware Navigation Using Multi-Agent Reinforcement Learning

Efficient Multi-agent Cooperative Navigation in Unknown Environments with Interlaced Deep Reinforcement Learning

MAexp: A Generic Platform for RL-based Multi-Agent Exploration

Teaching Agents how to Map: Spatial Reasoning for Multi-Object Navigation

Multi-Agent Exploration of an Unknown Sparse Landmark Complex via Deep Reinforcement Learning

Learning and Planning with a Semantic Model

Multi-Agent Embodied Visual Semantic Navigation with Scene Prior Knowledge

Multi-Object Navigation in real environments using hybrid policies