Event-Triggered ADP for Nonzero-Sum Games of Unknown Nonlinear Systems

Qingtao Zhao, Jian Sun, Gang Wang, Jie Chen
DOI: https://doi.org/10.1109/tnnls.2021.3071545
IF: 14.255
2021-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: For nonzero-sum (NZS) games of nonlinear systems, reinforcement learning (RL) or adaptive dynamic programming (ADP) has shown its capability of approximating the desired index performance and the optimal input policy iteratively. In this article, an event-triggered ADP is proposed for NZS games of continuous-time nonlinear systems with completely unknown system dynamics. To achieve the Nash equilibrium solution approximately, the critic neural networks and actor neural networks are utilized to estimate the value functions and the control policies, respectively. Compared with the traditional time-triggered mechanism, the proposed algorithm updates the neural network weights as well as the inputs of players only when a state-based event-triggered condition is violated. It is shown that the system stability and the weights' convergence are still guaranteed under mild assumptions, while occupation of communication and computation resources is considerably reduced. Meanwhile, the infamous Zeno behavior is excluded by proving the existence of a minimum inter-event time (MIET) to ensure the feasibility of the closed-loop event-triggered continuous-time system. Finally, a numerical example is simulated to illustrate the effectiveness of the proposed approach.
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper addresses the optimal control of unknown nonlinear systems in nonzero-sum (NZS) games. Specifically, it proposes an event-triggered adaptive dynamic programming (ADP) method for multi-player NZS games of continuous-time nonlinear systems. Traditional time-triggered mechanisms update the neural network weights and control inputs at every sampling instant, which wastes communication and computational resources. The event-triggered mechanism proposed in the paper updates the weights and control inputs only when the change in the system state exceeds a certain threshold, significantly reducing resource consumption.

### Main Contributions

1. **Reduction of Ineffective Learning**: Compared to existing methods, the algorithm updates the network weights and control inputs only when the system state changes significantly, which avoids ineffective learning driven by noise.
2. **Completely Unknown System Model**: Existing methods usually assume that the system model is known or partially known. The method in this paper handles completely unknown nonlinear systems without using identifier techniques.
3. **Theoretical Guarantees**: The paper proves system stability and weight convergence for the proposed algorithm and excludes the well-known Zeno behavior, providing theoretical guarantees for performance-critical applications.

### Method Overview

- **System Model**: Consider a continuous-time nonlinear system with multiple control inputs.
- **Performance Index**: Define a quadratic performance index for each player.
- **Nash Equilibrium**: The goal is to find, for each player, the control strategy that minimizes that player's performance index given the other players' strategies; this set of strategies is the Nash equilibrium.
- **Event-Triggered Condition**: Design an event-triggered condition that triggers an update when the change in the system state exceeds a threshold.
- **Neural Network Structure**: Use critic networks and actor networks to estimate the value functions and the control strategies, respectively.
- **Weight Update**: Update the output-layer weights of the neural networks only when an event is triggered.

### Theoretical Analysis

- **System Stability**: Prove the asymptotic stability of the closed-loop system using the Lyapunov function method.
- **Weight Convergence**: Prove the convergence of the neural network weights.
- **Zeno Behavior Elimination**: Exclude Zeno behavior by proving the existence of a minimum inter-event time (MIET).

### Simulation Validation

- **Numerical Example**: Validate the effectiveness of the proposed method through a numerical example that compares the performance of the event-triggered ADP controller with that of the time-triggered ADP controller.

In summary, this paper proposes an event-triggered ADP method for the optimal control problem of multi-player NZS games in unknown nonlinear systems, with both theoretical and practical value.
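To make the event-triggered condition and the minimum inter-event time more concrete, the following minimal sketch simulates two players who hold their inputs constant between events. It is not the paper's algorithm: there are no critic or actor networks, and the plant is a known scalar system. The plant `x' = x + u1 + u2`, the fixed gains `k1` and `k2`, the relative threshold `sigma`, and the trigger rule `|x_hat - x| > sigma * |x|` are all assumptions chosen only for illustration.

```python
def simulate(sigma=0.2, dt=1e-3, t_end=5.0, x0=1.0, k1=1.5, k2=1.5):
    """Event-triggered zero-order-hold control of the toy plant x' = x + u1 + u2.

    All parameters are hypothetical; the paper's trigger condition and
    learned policies are not reproduced here.
    """
    n_steps = int(round(t_end / dt))
    x = x0
    x_hat = x0            # state sampled at the most recent event
    event_times = [0.0]   # the first sample is taken at t = 0
    for step in range(1, n_steps + 1):
        # Both players keep their inputs fixed between events.
        u1 = -k1 * x_hat
        u2 = -k2 * x_hat
        x += dt * (x + u1 + u2)          # forward-Euler integration step
        t = step * dt
        # Sampling error between the held state and the current state.
        if abs(x_hat - x) > sigma * abs(x):
            x_hat = x                    # event: resample, inputs change next step
            event_times.append(t)
    # Shortest gap between consecutive events (an empirical MIET estimate).
    intervals = [b - a for a, b in zip(event_times, event_times[1:])]
    return x, len(event_times), n_steps, min(intervals)


if __name__ == "__main__":
    x_final, n_events, n_steps, miet = simulate()
    print(f"final state: {x_final:.2e}")
    print(f"input updates: {n_events} (event-triggered) vs {n_steps} (every step)")
    print(f"smallest inter-event time: {miet:.4f} s")
```

In this toy setting the inputs are recomputed far less often than once per integration step, and the smallest observed inter-event time stays well above the step size, which is the kind of behavior a positive MIET guarantees. A time-triggered baseline corresponds to updating `x_hat` at every step.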