Abstract: Intelligent agents and multi-agent systems play important roles in applications such as the control of drone swarms. Multi-agent navigation and obstacle avoidance, a foundational capability for more advanced applications, is therefore of great importance. In these tasks, the interacting decisions of the agents and the dynamic changes in their behavior become difficult for traditional route-planning algorithms or reinforcement learning algorithms to handle as the environment grows more complex. The classical multi-agent reinforcement learning algorithm, multi-agent deep deterministic policy gradient (MADDPG), addressed two problems of earlier algorithms: a non-stationary training process and an inability to cope with environmental randomness. However, MADDPG ignores the temporal information hidden in agents' interactions with the environment. In addition, because its centralized training with decentralized execution (CTDE) scheme has each agent's critic network take all agents' actions and the full environment information as input, it scales poorly to larger numbers of agents. To address the first limitation, this article proposes a new algorithm, MADDPG-LSTMactor, which combines MADDPG with long short-term memory (LSTM). By feeding an agent's observations from consecutive timesteps into its policy network, it lets the LSTM layer process the hidden temporal information. Experimental results show that this algorithm performs better in scenarios with a small number of agents. To address the second limitation, this article puts forward a lightweight MADDPG (MADDPG-L) algorithm, which simplifies the input of the critic network. Experimental results show that this algorithm outperforms MADDPG when the number of agents is large.
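
The two ideas in the abstract can be illustrated with a small sketch. It is not the article's implementation. The class `ObservationWindow`, the function `centralized_critic_input_dim`, the zero-padding at the start of an episode, and the assumption that all agents share the same observation and action sizes are hypothetical choices made for illustration. The first part shows one plausible way to collect an agent's observations from consecutive timesteps into a fixed-length sequence that a recurrent policy network could consume. The second part shows why a critic that takes every agent's observation and action has an input that grows linearly with the number of agents. How MADDPG-L simplifies the critic input is not stated in the abstract, so it is not modeled here.

```python
from collections import deque
from typing import Deque, List, Sequence


class ObservationWindow:
    """Rolling window over one agent's most recent observations (hypothetical helper)."""

    def __init__(self, seq_len: int, obs_dim: int) -> None:
        if seq_len < 1 or obs_dim < 1:
            raise ValueError("seq_len and obs_dim must be positive")
        self.seq_len = seq_len
        self.obs_dim = obs_dim
        # deque with maxlen drops the oldest observation automatically.
        self._buffer: Deque[List[float]] = deque(maxlen=seq_len)

    def reset(self) -> None:
        """Clear the window at the start of a new episode."""
        self._buffer.clear()

    def push(self, obs: Sequence[float]) -> List[List[float]]:
        """Add the newest observation and return a (seq_len x obs_dim) sequence."""
        if len(obs) != self.obs_dim:
            raise ValueError("observation has the wrong dimension")
        self._buffer.append(list(obs))
        # Assumption: zero-pad at the front until seq_len observations exist.
        missing = self.seq_len - len(self._buffer)
        padding = [[0.0] * self.obs_dim for _ in range(missing)]
        return padding + list(self._buffer)


def centralized_critic_input_dim(n_agents: int, obs_dim: int, act_dim: int) -> int:
    """Input size of a critic that sees every agent's observation and action.

    Assumption: all agents have identical observation and action sizes.
    """
    return n_agents * (obs_dim + act_dim)


if __name__ == "__main__":
    window = ObservationWindow(seq_len=3, obs_dim=2)
    for step_obs in ([1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]):
        sequence = window.push(step_obs)
    print(sequence)  # the three most recent observations, oldest first

    for n in (3, 6, 12):
        print(n, centralized_critic_input_dim(n, obs_dim=18, act_dim=5))
```

In a full implementation, the sequence returned by `push` would be converted to a tensor and passed through an LSTM layer inside the actor. The window length is a hyperparameter that this sketch leaves open.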

HiSOMA: A hierarchical multi-agent model integrating self-organizing neural networks with multi-agent deep reinforcement learning

Hierarchical Task Network Planning for Facilitating Cooperative Multi-Agent Reinforcement Learning

Subgoal-based Hierarchical Reinforcement Learning for Multi-Agent Collaboration

Hierarchical Multi-Agent Reinforcement Learning for Air Combat Maneuvering

Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning

Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction

Reinforcement learning for multi-agent formation navigation with scalability

Soft-HGRNs: Soft Hierarchical Graph Recurrent Networks for Multi-Agent Partially Observable Environments

From proprioception to long-horizon planning in novel environments: A hierarchical RL model

Hierarchical Method for Cooperative Multiagent Reinforcement Learning in Markov Decision Processes

ALMA: Hierarchical Learning for Composite Multi-Agent Tasks

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation

Guiding Multi-agent Multi-task Reinforcement Learning by a Hierarchical Framework with Logical Reward Shaping

Hierarchical Consensus-Based Multi-Agent Reinforcement Learning for Multi-Robot Cooperation Tasks

Hybrid Recurrent Models Support Emergent Descriptions for Hierarchical Planning and Control

Hierarchical and Stable Multiagent Reinforcement Learning for Cooperative Navigation Control

Hierarchical Multi-Agent Skill Discovery

Multi-Agent Reinforcement Learning with Selective State-Space Models

The Design and Realization of Multi-agent Obstacle Avoidance based on Reinforcement Learning

HELSA: Hierarchical Reinforcement Learning with Spatiotemporal Abstraction for Large-Scale Multi-Agent Path Finding