Abstract: This paper investigates a Q-learning scheme for the optimal consensus control of discrete-time multiagent systems. The Q-learning algorithm is carried out by reinforcement learning (RL) using system data instead of system dynamics information. In the multiagent system, the agents interact with each other and at least one agent can communicate with the leader directly; this communication topology is described by an algebraic graph structure. The objective is to make all agents synchronize with the leader and make the performance indices reach a Nash equilibrium. On the one hand, the solutions of the optimal consensus control problem for multiagent systems are obtained by solving the coupled Hamilton–Jacobi–Bellman (HJB) equation. However, it is difficult to obtain analytical solutions of the discrete-time HJB equation directly. On the other hand, accurate mathematical models of most real-world systems are hard to obtain. To overcome these difficulties, a Q-learning algorithm is developed that uses system data rather than an accurate system model. We formulate the performance index and the corresponding Bellman equation of each agent i. Then, the Q-function Bellman equation is derived from the Q-function. Policy iteration is adopted to calculate the optimal control iteratively, and the least squares (LS) method is employed to facilitate implementation. A stability analysis of the proposed policy-iteration Q-learning algorithm for multiagent systems is given. Two simulation examples verify the effectiveness of the proposed scheme.
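
The combination of policy iteration and least squares described in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's multiagent algorithm: it assumes a single agent, linear dynamics, and a quadratic cost, and it omits the graph coupling between agents. All names (`quad_basis`, `theta_to_H`, `q_learning_pi`, `step`) and the matrices `A`, `B`, `Qx`, `R` are hypothetical choices made for illustration. The learner only sees data tuples (x, u, x_next); `A` and `B` are used solely to generate that data.

```python
import numpy as np

def quad_basis(z):
    # Features such that quad_basis(z) @ theta == z @ H @ z for symmetric H,
    # where theta holds the upper triangle of H (off-diagonal terms count twice).
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)
    return scale * z[i] * z[j]

def theta_to_H(theta, d):
    # Rebuild the symmetric d x d matrix H from its upper-triangle parameters.
    H = np.zeros((d, d))
    i, j = np.triu_indices(d)
    H[i, j] = theta
    H[j, i] = theta
    return H

def q_learning_pi(step, n, m, Qx, R, K0, iters=10, samples=200, seed=0):
    # Policy iteration on Q(x, u) = z @ H @ z with z = [x; u], using data only.
    rng = np.random.default_rng(seed)
    K = K0.copy()
    H = None
    for _ in range(iters):
        Phi, y = [], []
        for _ in range(samples):
            x = rng.normal(size=n)
            u = -K @ x + rng.normal(size=m)      # exploration noise on the input
            x_next = step(x, u)                  # observed data, model not used
            u_next = -K @ x_next                 # next input follows the policy
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, u_next])
            # Q-function Bellman equation: Q(z) - Q(z_next) = stage cost
            Phi.append(quad_basis(z) - quad_basis(z_next))
            y.append(x @ Qx @ x + u @ R @ u)
        # Policy evaluation by least squares
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = theta_to_H(theta, n + m)
        # Policy improvement: minimize z @ H @ z over u, giving u = -K x
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H

# Assumed example system (stable, so the zero gain is an admissible start).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
Qx, R = np.eye(2), np.eye(1)
K, H = q_learning_pi(lambda x, u: A @ x + B @ u, 2, 1, Qx, R, np.zeros((1, 2)))
```

Policy iteration of this kind needs an initial stabilizing gain; the example system is chosen to be open-loop stable so that a zero gain qualifies.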

Leader-Follower Optimal Consensus of Discrete-Time Linear Multi-agent Systems based on Q-Learning

Consensus Seeking in Multi-Agent Systems with an Active Leader and Communication Delays

Distributed Reduced-Order Observer-Based Approach to Consensus Problems for Linear Multi-Agent Systems

Multi-agent consensus tracking with initial state error by iterative learning control

Q-learning Solution for Optimal Consensus Control of Discrete-Time Multiagent Systems Using Reinforcement Learning

Multi-Agent Reinforcement Learning Control for Consensus Problems of Uncertain Nonlinear Multi-Agent Systems

Reinforcement Learning Consensus Control for Discrete-Time Multi-Agent Systems

Optimal consensus control for unknown second-order multi-agent systems: Using model-free reinforcement learning method

Neighbor Q‐learning Based Consensus Control for Discrete‐time Multi‐agent Systems

Q-learning algorithm in solving consensusability problem of discrete-time multi-agent systems

Optimal Distributed Leader-Following Consensus of Linear Multi-Agent Systems: A Dynamic Average Consensus-Based Approach

Linear Quadratic Optimal Consensus of Discrete-Time Multi-Agent Systems with Optimal Steady State: A Distributed Model Predictive Control Approach

Optimized leader-follower consensus control for high-order nonlinear multi-agent system modeled in canonical dynamic form

Model‐free distributed optimal control for general discrete‐time linear systems using reinforcement learning

LQR-Based Optimal Leader-Follower Consensus of Second-Order Multi-agent Systems

Leader-Follower Consensus for Multi-Agent Systems with Three-Layer Network Framework and Dynamic Interaction Jointly Connected Topology

Linear Quadratic Leader-following Consensus of Multi-agent Systems: a Decentralized Computation and Distributed Information Fusion Strategy

Consensus Control for A Class of Second-Order Multi-Agent Systems: an Iterative Learning Approach

Data-driven output consensus for a class of discrete-time multiagent systems by reinforcement learning techniques

Data-Based Optimal Consensus Control for Multiagent Systems With Policy Gradient Reinforcement Learning

Finite-time Non-overshooting Leader-following Consensus Control for Multi-Agent Systems