Abstract: Intelligent agents and multi-agent systems play important roles in applications such as the control of drone swarms, and multi-agent navigation and obstacle avoidance, a foundational capability for more advanced applications, is therefore of great importance. In multi-agent navigation and obstacle avoidance tasks, the agents' interacting decisions and dynamic changes become difficult for traditional route planning algorithms or reinforcement learning algorithms to handle as the complexity of the environment increases. The classical multi-agent reinforcement learning algorithm, multi-agent deep deterministic policy gradient (MADDPG), addressed two problems of earlier algorithms: a non-stationary training process and an inability to cope with environmental randomness. However, MADDPG ignores the temporal information hidden in agents' interactions with the environment. In addition, because its centralized training with decentralized execution (CTDE) scheme has each agent's critic network operate on all agents' actions and the global environment information, it scales poorly to larger numbers of agents. To address MADDPG's neglect of temporal information, this article proposes a new algorithm, MADDPG-LSTMactor, which combines MADDPG with long short-term memory (LSTM). By feeding an agent's observations from consecutive timesteps into its policy network, it allows the LSTM layer to extract the hidden temporal information. Experimental results show that this algorithm performs better in scenarios with a small number of agents. To address MADDPG's inefficiency in scenarios with many agents, this article also proposes a lightweight MADDPG (MADDPG-L) algorithm, which simplifies the input of the critic network. Experimental results show that this algorithm outperforms MADDPG when the number of agents is large.
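The two structural points in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The first function assumes that the "whole environment information" seen by a MADDPG critic is the concatenation of every agent's observation, so the critic input grows linearly with the number of agents. The `ObservationWindow` class is a hypothetical helper showing one way to assemble the consecutive-timestep observations that an LSTM policy network would consume; the window length and the padding rule at episode start are assumptions, since the abstract does not specify them.

```python
from collections import deque
from typing import Deque, List, Sequence


def centralized_critic_input_dim(n_agents: int, obs_dim: int, act_dim: int) -> int:
    """Input width of a critic that sees every agent's observation and action.

    Assumes all agents share the same observation and action sizes.
    """
    return n_agents * (obs_dim + act_dim)


class ObservationWindow:
    """Keeps the last `seq_len` observations of one agent, oldest first.

    Hypothetical helper: repeating the first observation to fill the
    window at episode start is an assumption made for this sketch.
    """

    def __init__(self, seq_len: int) -> None:
        if seq_len < 1:
            raise ValueError("seq_len must be at least 1")
        self.seq_len = seq_len
        self._buf: Deque[List[float]] = deque(maxlen=seq_len)

    def reset(self) -> None:
        """Clear the window at the start of a new episode."""
        self._buf.clear()

    def push(self, obs: Sequence[float]) -> List[List[float]]:
        """Add the newest observation and return the current window."""
        obs = list(obs)
        if not self._buf:
            # Pad the window by repeating the first observation.
            self._buf.extend([obs] * self.seq_len)
        else:
            # deque(maxlen=...) silently drops the oldest entry.
            self._buf.append(obs)
        return [list(o) for o in self._buf]
```

With 10-dimensional observations and 2-dimensional actions (illustrative sizes), the critic input is 36 values wide for 3 agents and 144 for 12, which is the growth a lighter critic input would aim to reduce. The sequence returned by `push` has shape `seq_len x obs_dim` and would be passed to the LSTM layer of the policy network at each step.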
