Abstract: We study the task of learning state representations from potentially high-dimensional observations, with the goal of controlling an unknown partially observable system. We pursue a direct latent model learning approach, where a dynamic model in some latent state space is learned by predicting quantities directly related to planning (e.g., costs) without reconstructing the observations. In particular, we focus on an intuitive cost-driven state representation learning method for solving Linear Quadratic Gaussian (LQG) control, one of the most fundamental partially observable control problems. As our main results, we establish finite-sample guarantees of finding a near-optimal state representation function and a near-optimal controller using the directly learned latent model. To the best of our knowledge, despite various empirical successes, prior to this work it was unclear if such a cost-driven latent model learner enjoys finite-sample guarantees. Our work underscores the value of predicting multi-step costs, an idea that is key to our theory, and notably also an idea that is known to be empirically valuable for learning state representations.

What problem does this paper attempt to address?

The question this paper addresses is: **Can direct latent model learning solve the linear-quadratic-Gaussian (LQG) control problem?** Specifically, the paper explores how to control an unknown partially observable system by learning state representations from potentially high-dimensional observations. The authors adopt a direct latent model learning method: a dynamic model is learned by predicting quantities directly related to planning (such as costs), without reconstructing the observations.

### Problem Background

1. **Control of partially observable systems**:
   - In a partially observable system, the true state cannot be observed directly; it can only be inferred from observations and control inputs.
   - LQG control is one of the classic partially observable control problems and has both theoretical and practical significance.
2. **Limitations of existing methods**:
   - **Reconstruction-based methods**: Many existing methods learn state representations by reconstructing observations. When observations are high-dimensional and noisy, this is difficult, and the reconstructed observations may contain information irrelevant to control.
   - **Model-free methods**: Model-free methods learn policies directly, but they usually require a large number of samples and generalize poorly in complex tasks.

### Core contributions of the paper

- **Direct cost-driven state representation learning**: The paper studies a method that learns state representations by predicting multi-step cumulative costs, rather than by reconstructing observations or learning inverse models. This ties the representation more directly to the control objective.
- **Finite-sample guarantee**: The authors prove that a near-optimal state representation function and a near-optimal controller can be found from a finite number of samples. To the authors' knowledge, this is the first finite-sample guarantee for this kind of cost-driven latent model learning.

### Specific problem description

The paper focuses on the following question:

> Can direct cost-driven state representation learning effectively solve the LQG control problem?

To this end, the authors study a partially observable linear time-varying (LTV) dynamic system:

\[ x_{t+1} = A_t^* x_t + B_t^* u_t + w_t, \quad y_t = C_t^* x_t + v_t, \]

where \( x_t \) is the state, \( y_t \) is the observation, \( u_t \) is the control input, and \( w_t \) and \( v_t \) are the process noise and observation noise, respectively. The cost incurred at each step is quadratic:

\[ c_t(x, u) = \|x\|^2_{Q_t^*} + \|u\|^2_{R_t^*}, \]

where \( \|x\|^2_{Q} = x^\top Q x \). The goal is to learn a state representation from observations and control inputs, and with it a control policy that minimizes the cumulative cost.

### Conclusion

Through rigorous theoretical analysis and experimental verification, the paper shows that direct cost-driven state representation learning can solve the LQG control problem with a finite number of samples. This result provides a theoretical basis for future research and applications.
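To make the system and cost definitions above concrete, here is a minimal pure-Python sketch that simulates a small partially observable linear system and computes per-step and multi-step cumulative costs. Everything specific in it is an illustrative assumption rather than something taken from the paper: the toy matrices, the constant control input, the scalar `noise_std`, and the helper names `matvec`, `quad_form`, `rollout`, and `multi_step_costs`. It only shows what kind of quantity a cost-driven learner would be asked to predict; it is not the paper's algorithm.

```python
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def quad_form(v, M):
    """Weighted squared norm ||v||_M^2 = v^T M v."""
    return sum(a * b for a, b in zip(v, matvec(M, v)))

def rollout(A, B, C, Q, R, x0, controls, noise_std=0.1, seed=0):
    """Simulate x_{t+1} = A_t x_t + B_t u_t + w_t, y_t = C_t x_t + v_t.

    A, B, C, Q, R are lists of per-step matrices (time-varying system).
    Returns the observations y_t and the per-step costs c_t(x_t, u_t).
    """
    rng = random.Random(seed)
    x = list(x0)
    observations, costs = [], []
    for t, u in enumerate(controls):
        # Observation: a learner would see y_t, never x_t itself.
        y = [yi + rng.gauss(0.0, noise_std) for yi in matvec(C[t], x)]
        observations.append(y)
        # Quadratic cost c_t(x, u) = ||x||_Q^2 + ||u||_R^2.
        costs.append(quad_form(x, Q[t]) + quad_form(u, R[t]))
        # State transition with Gaussian process noise w_t.
        drift = [a + b for a, b in zip(matvec(A[t], x), matvec(B[t], u))]
        x = [d + rng.gauss(0.0, noise_std) for d in drift]
    return observations, costs

def multi_step_costs(costs, k):
    """Sum of the next k costs from each step t (truncated at the horizon)."""
    return [sum(costs[t:t + k]) for t in range(len(costs))]

# Hypothetical toy system: 2 states, 1 input, 1 output, horizon of 4 steps.
T = 4
A = [[[1.0, 0.1], [0.0, 1.0]]] * T
B = [[[0.0], [0.1]]] * T
C = [[[1.0, 0.0]]] * T
Q = [[[1.0, 0.0], [0.0, 1.0]]] * T
R = [[[0.5]]] * T
controls = [[-1.0]] * T

# Noise is switched off here so the numbers are easy to check by hand.
ys, cs = rollout(A, B, C, Q, R, x0=[1.0, 0.0], controls=controls, noise_std=0.0)
targets = multi_step_costs(cs, k=2)
```

In this sketch, `targets[t]` is the sum of the costs at steps `t` and `t + 1`. A cost-driven learner in the spirit of the summary would fit a representation of the observation and control history that predicts such sums, instead of reconstructing `ys`.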

Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control?

Learning-Based Optimal Control with Performance Guarantees for Unknown Systems with Latent States

Imitation and Transfer Learning for LQG Control

Learning Algorithm for LQG Model with Constrained Control

Learning the Linear Quadratic Regulator from Nonlinear Observations

Latent Linear Quadratic Regulator for Robotic Control Tasks

Dual Control with Active Learning using Gaussian Process Regression

Causality-Informed Data-Driven Predictive Control

Observation-based Optimal Control Law Learning with LQR Reconstruction

Tracking control of latent dynamic systems with application to spacecraft attitude control

Learning Robust Data-based LQG Controllers from Noisy Data

Analysis of the Optimization Landscape of Linear Quadratic Gaussian (LQG) Control

Optimal Adaptive Control of Linear Stochastic Systems with Quadratic Cost Function

Distributionally Robust Linear Quadratic Control

Direct Data-Driven Discounted Infinite Horizon Linear Quadratic Regulator with Robustness Guarantees

Towards Learning Controllable Representations of Physical Systems

Regret Analysis of Multi-task Representation Learning for Linear-Quadratic Adaptive Control

Controlling Unknown Quantum States via Data-Driven State Representations

Correct-by-construction reach-avoid control of partially observable linear stochastic systems

Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems