Reinforcement Learning for Jointly Optimal Coding and Control over a Communication Channel

Evelyn Hubbard,Liam Cregg,Serdar Yüksel
2024-11-21
Abstract:We develop rigorous approximation and near optimality results for the optimal control of a system which is connected to a controller over a finite rate noiseless channel. While structural results on the optimal encoding and control have been obtained in the literature, their implementation has been prohibitive in general, except for linear models. We develop regularity and structural properties, followed by approximations and reinforcement learning results. Notably, we establish near optimality of finite model approximations as well as sliding finite window coding policies and their reinforcement learning convergence to near optimality.
Optimization and Control
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: in networked control systems, between the encoder and the controller connected via a finite - rate noiseless channel, how to jointly optimize the coding and control strategies to achieve optimal performance. Specifically, the paper aims to develop strict approximation and near - optimal results, especially for the research on approximation to near - optimal via finite - model approximation, sliding finite - window coding strategies and their convergence to near - optimal in reinforcement learning. ### Problem Background Networked control systems (NCS) involve stochastic control systems with communication channels between different sites (such as sensors, actuators and controllers). In this case, a large amount of literature has studied the stability and optimization problems of these systems under various information constraints. However, although the structural results are very useful in theory, their implementation is usually computationally challenging, especially for nonlinear models. ### Main Contributions of the Paper 1. **Strict Approximation and Near - optimal Results**: The paper has developed strict approximation methods and proved the near - optimal of finite - model approximation. 2. **Sliding Finite - window Coding Strategy**: A sliding finite - window coding strategy has been introduced and its near - optimal has been proved. 3. **Reinforcement Learning Algorithm**: A reinforcement learning algorithm that can be proven to be close to optimal has been proposed for optimizing coding and control strategies. ### Specific Problem Description The paper considers a networked control problem, in which the controlled Markov source is observed through a noiseless communication channel and controlled using the data obtained from this channel. The system dynamic equation is: \[ x_{t + 1}=f(x_t, u_t, w_t) \] where: - \( x_t \) is the state defined on the finite - state space \( X \); - \( u_t \) is the control action defined on the finite - action space \( U \); - \( w_t \) is an independently and identically distributed noise process; - \( x_0 \) is a random variable with an initial distribution \( \pi_0 \). At each time stage \( t \), \( x_t \) is causally encoded through the noiseless channel. The controller receives the information from the communication channel and selects a control action \( u_t \), and then transmits it to the plant. The encoder and the controller use coding and control strategies respectively to calculate their outputs. ### Optimization Objective For a finite - time horizon \( N \), the optimization objective is defined as: \[ J_N(\pi_0)=\inf_{\gamma\in\Gamma_A}E_\gamma^{\pi_0}\left[\sum_{k = 0}^{N - 1}c(x_k, u_k)\right] \] where \( E_\gamma^{\pi_0} \) represents the expectation of \((x_t, u_t)_{t\geq0}\) under the initial distribution \( \pi_0 \) and the joint coding - control strategy \( \gamma \). ### Conclusion Through strict mathematical derivations and theoretical analyses, the paper proposes a new method to solve the joint coding and control optimization problem in networked control systems. This method not only provides strict approximation and near - optimal results in theory, but also demonstrates the feasibility in practical applications through the reinforcement learning algorithm.