Learning Automata-Based Multiagent Reinforcement Learning for Optimization of Cooperative Tasks
Zhen Zhang, Dongqing Wang, Junwei Gao
DOI: https://doi.org/10.1109/tnnls.2020.3025711
IF: 14.255
2021-10-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: Multiagent reinforcement learning (MARL) has been extensively used in many applications for its tractable implementation and task distribution. Learning automata, which can be classified under MARL in the category of independent learners, are used to obtain the optimal joint action or some type of equilibrium. Learning automata have two main advantages. First, they do not require any agent to observe the action of any other agent. Second, they are simple in structure and easy to implement. Learning automata have been applied to function optimization, image processing, data clustering, recommender systems, and wireless sensor networks. However, few learning automata-based algorithms have been proposed for the optimization of cooperative repeated games and stochastic games. We propose an algorithm known as learning automata for optimization of cooperative agents (LA-OCA). To make learning automata applicable to cooperative tasks, we transform the environment into a P-model by introducing an indicator variable whose value is one when the maximal reward is obtained and zero otherwise. Theoretical analysis shows that all the strict optimal joint actions are stable critical points of the model of LA-OCA in cooperative repeated games with an arbitrary finite number of players and actions. Simulation results show that LA-OCA obtains the pure optimal joint strategy with a success rate of 100% in all three cooperative tasks and outperforms the other algorithms in terms of learning speed.
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture
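
The P-model transformation described in the abstract can be illustrated with a minimal sketch. It is not the LA-OCA algorithm itself, whose update rule is not given here. It assumes the classical linear reward-inaction scheme as a stand-in update, a hypothetical two-agent payoff matrix (`PAYOFF`), and a maximal reward that is known in advance. The names `indicator`, `lri_update`, and `run` are illustrative only.

```python
import random

# Hypothetical 2-agent, 3-action cooperative game: both agents receive
# PAYOFF[a0][a1]. The matrix is an assumption, not taken from the paper.
PAYOFF = [
    [11, -30, 0],
    [-30, 7, 6],
    [0, 0, 5],
]
# Assumption: the maximal reward is known in advance.
MAX_REWARD = max(max(row) for row in PAYOFF)


def indicator(reward, max_reward=MAX_REWARD):
    """P-model feedback: 1 if the maximal reward was obtained, else 0."""
    return 1 if reward == max_reward else 0


def lri_update(probs, action, beta, rate):
    """Linear reward-inaction step (stand-in rule, not LA-OCA's own).

    With beta == 1 the probability vector moves toward the chosen action;
    with beta == 0 it is left unchanged.
    """
    return [
        p + rate * beta * ((1.0 if i == action else 0.0) - p)
        for i, p in enumerate(probs)
    ]


def run(steps=5000, rate=0.1, seed=0):
    """Run two independent automata; neither observes the other's action."""
    rng = random.Random(seed)
    n_actions = len(PAYOFF)
    probs = [[1.0 / n_actions] * n_actions for _ in range(2)]
    for _ in range(steps):
        # Each agent samples from its own action-probability vector.
        actions = [rng.choices(range(n_actions), weights=p)[0] for p in probs]
        reward = PAYOFF[actions[0]][actions[1]]
        beta = indicator(reward)  # shared binary feedback
        probs = [lri_update(p, a, beta, rate) for p, a in zip(probs, actions)]
    return probs


if __name__ == "__main__":
    for agent, p in enumerate(run()):
        print(agent, [round(x, 3) for x in p])
```

Under these assumptions the binary feedback equals one only at the optimal joint action, so each agent's probability of its optimal action never decreases. Each agent updates from its own action and the shared feedback alone, which mirrors the independent-learner property stated in the abstract.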