A Distributed Deep Koopman Learning Algorithm for Control

Wenjian Hao, Zehui Lu, Devesh Upadhyay, Shaoshuai Mou
2024-12-10
Abstract: This paper proposes a distributed data-driven framework, named distributed deep Koopman learning for control (DDKC), to address the challenge of learning dynamics from a large amount of training data for optimal control purposes. Given a system state-input trajectory and a multi-agent system (MAS), the key idea of DDKC is to assign each agent in the MAS an offline partial trajectory. Each agent then approximates the unknown dynamics linearly using a deep neural network (DNN) and Koopman operator theory, communicating with other agents so that all agents in the MAS reach a consensus on the approximated dynamics. Simulations on a surface vehicle first show that the proposed method achieves consensus on the learned dynamics and that the dynamics learned by each agent achieve reasonably small estimation errors over the testing data. Furthermore, simulations combining the learned dynamics with model predictive control (MPC) to drive the surface vehicle in goal-tracking and station-keeping tasks demonstrate that the dynamics learned by DDKC are precise enough to be used for optimal control design.
Systems and Control
What problem does this paper attempt to address?
This paper addresses how to learn a system's dynamic model from a large amount of training data using a distributed algorithm over a multi-agent system (MAS), so that the learned model can be used for optimal control. Specifically, it proposes a framework named distributed deep Koopman learning for control (DDKC), which targets the following key challenges:

1. **Large-scale data processing**: With a large number of state-input pairs, it is difficult for a single agent to process the data effectively, so a distributed method is needed to share the computational burden.
2. **Partially observed trajectories**: Each agent observes only part of the system's trajectory rather than the complete trajectory, so the agents must cooperate to reach a consistent estimate of the system dynamics.
3. **Linear representation of nonlinear systems**: Using Koopman operator theory, the dynamics of a nonlinear time-invariant system (NTIS) are approximated in a linear form, which simplifies controller design.

### Specific problem description

Consider a discrete-time, time-invariant system:
\[x(t + 1)=f(x(t), u(t)),\]
where \(t = 0, 1, 2,\ldots\) is the time index, \(x(t)\in\mathbb{R}^n\) and \(u(t)\in\mathbb{R}^m\) are the system state and control input respectively, and \(f:\mathbb{R}^n\times\mathbb{R}^m\rightarrow\mathbb{R}^n\) is an unknown time-invariant mapping.
Suppose a system state-input trajectory from time 0 to \(T\) is given:
\[\xi=\{(x(t), u(t))\}_{t = 0}^T.\]

### Deep Koopman operator (DKO)

Using Koopman operator theory, the following discrete-time dynamical system is constructed:
\[g(\hat{x}(t + 1),\theta)=A g(\hat{x}(t),\theta)+B\hat{u}(t),\]
\[\hat{x}(t + 1)=C g(\hat{x}(t + 1),\theta),\]
where \(g(\cdot,\theta):\mathbb{R}^n\times\mathbb{R}^p\rightarrow\mathbb{R}^r\) is a mapping represented by a deep neural network (DNN) with parameter \(\theta\in\mathbb{R}^p\), and \(A\in\mathbb{R}^{r\times r}\), \(B\in\mathbb{R}^{r\times m}\), \(C\in\mathbb{R}^{n\times r}\) are constant matrices. Substituting the first equation into the second gives the estimated system dynamics:
\[\hat{x}(t + 1)=\hat{f}(\hat{x}(t),\hat{u}(t),\theta)=C(A g(\hat{x}(t),\theta)+B\hat{u}(t)).\]

### Multi-agent system (MAS)

Suppose there is a MAS containing \(N\geq1\) agents. Each agent \(i\) can observe only a partial trajectory \(\xi_i\), and agents can exchange information through communication. The goal is for each agent to learn a consistent dynamic model through cooperation.

### Paper contributions

1. A distributed deep Koopman learning algorithm (DDKC) is proposed, in which each agent \(i\) in the MAS maintains its own set of parameters \(K_i = \{A_i, B_i, C_i,\theta_i\}\) and cooperatively optimizes them by exchanging information with its neighbors.
2. The effectiveness of model predictive control (MPC) based on DDKC is verified in simulation on goal-tracking and station-keeping tasks for a surface vehicle.

Through these methods, the paper shows...
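To make the model structure described above concrete, the following is a minimal sketch of the one-step prediction \(\hat{x}(t + 1)=C(A g(\hat{x}(t),\theta)+B\hat{u}(t))\) and of dividing a trajectory \(\xi\) into partial trajectories \(\xi_i\). It is not the authors' implementation. The dimensions, the small tanh network standing in for \(g(\cdot,\theta)\), the random placeholder values for \(A\), \(B\), \(C\), \(\theta\), the contiguous splitting rule, and all helper names (`rand_matrix`, `matvec`, `predict`, `rollout`, `split_trajectory`) are assumptions made for illustration. No training or consensus step is shown.

```python
import math
import random

# Illustrative dimensions (assumptions): state n, input m, lifted dimension r.
n, m, r = 3, 2, 4
rng = random.Random(0)

def rand_matrix(rows, cols, scale=0.1):
    """Small random matrix; a placeholder for parameters that would be learned."""
    return [[scale * rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def vec_add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]

# theta: weights of a hypothetical one-hidden-layer tanh network standing in for the DNN.
theta = {"W1": rand_matrix(8, n, 1.0), "W2": rand_matrix(r, 8, 1.0)}

def g(x, theta):
    """Lift a state x in R^n to R^r."""
    hidden = [math.tanh(h) for h in matvec(theta["W1"], x)]
    return matvec(theta["W2"], hidden)

# Constant matrices of the lifted linear model (random placeholders here).
A = rand_matrix(r, r)
B = rand_matrix(r, m)
C = rand_matrix(n, r)

def predict(x, u):
    """One-step prediction x_hat(t+1) = C (A g(x_hat(t), theta) + B u_hat(t))."""
    lifted_next = vec_add(matvec(A, g(x, theta)), matvec(B, u))
    return matvec(C, lifted_next)

def rollout(x0, inputs):
    """Apply predict over a sequence of inputs; returns x0 followed by each prediction."""
    states = [x0]
    for u in inputs:
        states.append(predict(states[-1], u))
    return states

def split_trajectory(xi, num_agents):
    """Split a trajectory into contiguous parts, one per agent (assumed splitting rule)."""
    base, extra = divmod(len(xi), num_agents)
    parts, start = [], 0
    for i in range(num_agents):
        end = start + base + (1 if i < extra else 0)
        parts.append(xi[start:end])
        start = end
    return parts
```

For example, `predict([0.1, -0.2, 0.3], [0.5, -0.5])` returns a list of length `n`, and `split_trajectory(xi, N)` returns `N` lists whose concatenation is the original trajectory. Because the prediction is linear in the lifted state and the input, a model of this form can be used with linear MPC formulations once the parameters have been learned.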