Reinforcement learning for encouraging cooperation in a multiagent system

Wei-Cheng Jiang, Hong-Hao Huang, Yu-Teng Wang
DOI: https://doi.org/10.1016/j.ins.2024.120996
IF: 8.1
2024-07-13
Information Sciences
Abstract: Success in cooperative tasks may be compromised if cooperative stagnation and failure from information sharing occur. However, information sharing may require excessive memory. Other methods must be developed to encourage agents to take actions in accordance with the needs of the team. In this study, a method called the cooperative tendency model using Q-learning (CTM-Q) is proposed for a partial-communication multiagent team. Each agent maintains and records its tendency values (encouraging cooperation) and Q-values (encouraging goal-seeking) as input for a payoff function that is used to select actions. Each agent selects the action with the highest payoff value for the current state. The method improves learning performance, enabling agents to rapidly reach a consensus. In simulations, the proposed method accelerated learning for multiagent cooperative applications and outperformed competing methods in solution speed, convergence time, and stability.
computer science, information systems
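
The action-selection rule described in the abstract (combine tendency values and Q-values in a payoff function, then pick the action with the highest payoff) can be illustrated with a minimal sketch. The abstract does not give the form of the payoff function or how tendency values are updated, so the weighted sum below, the `weight` parameter, and the function names `payoff`, `select_action`, and `q_update` are assumptions made for illustration only, not the CTM-Q definitions. The Q-value update shown is the standard tabular Q-learning rule.

```python
def payoff(q_values, tendency_values, weight=0.5):
    """Combine goal-seeking and cooperative signals, one score per action.

    ASSUMPTION: a convex combination is used here purely for illustration;
    the paper's actual payoff function is not stated in the abstract.
    """
    return [(1.0 - weight) * q + weight * t
            for q, t in zip(q_values, tendency_values)]


def select_action(q_values, tendency_values, weight=0.5):
    """Return the index of the action with the highest payoff value."""
    scores = payoff(q_values, tendency_values, weight)
    return max(range(len(scores)), key=scores.__getitem__)


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place.

    q_table is a list of per-state lists of action values.
    """
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table


if __name__ == "__main__":
    q = [1.0, 0.2, 0.5]   # hypothetical Q-values for the current state
    t = [0.0, 1.0, 0.4]   # hypothetical tendency values for the same state
    print(select_action(q, t, weight=0.0))  # goal-seeking only
    print(select_action(q, t, weight=1.0))  # cooperation only
```

With `weight=0.0` the agent acts purely on its Q-values, and with `weight=1.0` purely on its tendency values; intermediate weights trade off the two signals under the assumed payoff form.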