Abstract: Reinforcement Learning (RL) is an efficient method for solving problems in which learning agents have no a priori knowledge of the environment. Ant Colony System (ACS) provides an indirect communication mechanism among cooperating agents and is an efficient method for solving combinatorial optimization problems. Building on the indirect-communication cooperation of ACS and the reinforcement-value update policy of RL, this paper proposes Q-ACS, a multiagent cooperative learning method applicable to both Markov Decision Processes (MDPs) and combinatorial optimization problems. The advantage of Q-ACS is that the learning agents share episodes that benefit the exploitation of accumulated knowledge and use the learned reinforcement values efficiently. Further, by taking visit counts into account, this paper proposes the T-ACS multiagent learning method. The merit of T-ACS is that the learning agents share better policies that benefit exploration during learning. Q-ACS and T-ACS are both homogeneous multiagent learning methods; in light of indirect media communication among heterogeneous agents, this paper also presents D-ACS, a heterogeneous multiagent RL method that combines the learning policies of Q-ACS and T-ACS and applies different update policies for reinforcement values. The agents in these methods cooperate simply by exchanging information in the form of reinforcement values updated in a model common to all agents. Because they explore the unknown environment actively and exploit learned knowledge effectively, the proposed methods can solve both MDP problems and combinatorial optimization problems. Experimental results on the hunter game and the traveling salesman problem show that the methods perform competitively with representative methods in each domain.
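The idea of agents cooperating only through reinforcement values kept in a common model can be illustrated with a minimal sketch. The abstract does not give the Q-ACS, T-ACS, or D-ACS update rules, so the code below is not the authors' method: it uses plain tabular Q-learning, and the chain environment, the function name `train_shared_q`, and all parameter values are hypothetical choices made for illustration. Several agents take turns in the same small MDP and write their updates into one shared table, so each agent benefits from what the others have already learned.

```python
import random


def train_shared_q(n_states=6, n_agents=3, episodes=200,
                   alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Plain Q-learning with one table shared by all agents (illustrative only).

    Hypothetical environment: a chain of states 0..n_states-1. Action 0 moves
    left, action 1 moves right. Reaching the last state gives reward 1 and
    ends the episode.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # The common model: every agent reads from and writes to this table.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        for _agent in range(n_agents):
            state = 0
            for _step in range(100):  # cap on episode length
                if state == goal:
                    break
                if rng.random() < epsilon:
                    action = rng.randrange(2)  # explore
                else:
                    best = max(q[state])  # exploit, breaking ties at random
                    action = rng.choice(
                        [a for a in (0, 1) if q[state][a] == best])
                next_state = max(0, state - 1) if action == 0 else state + 1
                reward = 1.0 if next_state == goal else 0.0
                # Standard Q-learning update written into the shared table.
                target = reward + gamma * max(q[next_state])
                q[state][action] += alpha * (target - q[state][action])
                state = next_state
    return q


if __name__ == "__main__":
    table = train_shared_q()
    for s, (left, right) in enumerate(table):
        print(f"state {s}: left={left:.3f} right={right:.3f}")
```

In this sketch the agents never exchange messages directly; the shared table is the only channel between them, which is the sense of indirect communication the abstract describes. The visit-count and heterogeneous variants mentioned above would need additional bookkeeping that the abstract does not specify.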

Reinforcement Learning with Task Decomposition for Cooperative Multiagent Systems.

Learning Reward Machines in Cooperative Multi-Agent Tasks

Hierarchical Multi-Agent Reinforcement Learning for Cooperative Tasks with Sparse Rewards in Continuous Domain

Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning

LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning

Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward

Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation

Multiagent Cooperating Learning Methods by Indirect Media Communication.

Guiding Multi-agent Multi-task Reinforcement Learning by a Hierarchical Framework with Logical Reward Shaping

A Multitier Reinforcement Learning Model for a Cooperative Multiagent System

Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning

Credit assignment in heterogeneous multi-agent reinforcement learning for fully cooperative tasks

Fully Decentralized Cooperative Multi-Agent Reinforcement Learning: A Survey

Learning Complex Teamwork Tasks Using a Given Sub-task Decomposition

Skills Regularized Task Decomposition for Multi-task Offline Reinforcement Learning

Exploring Task-oriented Communication in Multi-agent System: A Deep Reinforcement Learning Approach

Relation-Aware Learning for Multi-Task Multi-Agent Cooperative Games

Modeling the Interaction Between Agents in Cooperative Multi-Agent Reinforcement Learning

A Hierarchical Framework for Cooperative Tasks in Multi-agent Systems

A Cooperative Multi-Agent Reinforcement Learning Method Based on Coordination Degree

Cooperative Multi-Robot Task Allocation with Reinforcement Learning