Abstract: Optimal control problems with state distribution constraints have attracted interest for their expressivity, but solutions rely on linear approximations. We approach the problem of driving the state of a dynamical system in distribution from a sequential decision-making perspective. We formulate the optimal control problem as an appropriate Markov decision process (MDP), where the actions correspond to the state-feedback control policies. We then solve the MDP using Monte Carlo tree search (MCTS). This renders our method suitable for any dynamics model. A key component of our approach is a novel, easy-to-compute distance metric in the distribution space that allows our algorithm to guide the distribution of the state. We experimentally test our algorithm under both linear and nonlinear dynamics.

What problem does this paper attempt to address?

The problem this paper attempts to solve is: **how to steer the state distribution of a dynamical system from an initial distribution to a target distribution (distribution steering) in discrete time**. Specifically, the authors propose a method based on Monte Carlo tree search (MCTS) that applies to both linear and nonlinear dynamical systems.

### Problem Background

In many practical applications, such as robot swarms, spacecraft control, and mean-field stochastic control, it is not enough to control the state of the system; the state must also follow a specific probability distribution. Traditional optimal control methods usually require the state to belong to a given set, whereas distribution steering lets us specify the state distribution itself. This makes distribution steering a more natural formulation when dealing with uncertainty.

### Specific Problem Description

1. **Uncertainty of the Initial State**: The initial state of the dynamical system may be uncertain and is represented by a probability distribution.
2. **Target Distribution**: A control strategy must be designed so that the state distribution of the system gradually approaches a preset target distribution.
3. **Challenges**: Most existing methods rely on linear approximations, cannot directly handle complex nonlinear dynamics, and have difficulty planning effectively in continuous spaces.

### Main Contributions of the Paper

1. **Markov Decision Process (MDP) Modeling**: The authors model the discrete-time distribution steering problem as an MDP whose actions are state-feedback control policies, and solve it online with MCTS. Because the search only needs to simulate the dynamics, any dynamics model can be handled.
2. **New Distance Metric**: A new, easy-to-compute distance metric is introduced to measure the similarity between two distributions. The metric compares the probability content of the two distributions over a set of half-spaces.
3. **Experimental Verification**: Experiments are carried out on systems with linear and nonlinear dynamics to verify the effectiveness of the proposed algorithm.

### Summary of Mathematical Formulas

- State transition equation of the dynamical system:
  \[ x_{t+1} = f(x_t, \pi_t(x_t), w_t) \]
  where \(x_t\) is the state, \(\pi_t\) is the control policy, and \(w_t\) is the noise term.
- Objective function:
  \[ \operatorname*{minimize}_{\pi_t,\ \forall t \in [N-1]} \; E\left[\sum_{t=1}^{N-1} c_t(x_t, \pi_t, x_{t+1}) + D(\mu_N, \mu_f)\right] \]
  where \(D(\mu_N, \mu_f)\) measures the distance between the final state distribution \(\mu_N\) and the target distribution \(\mu_f\).
- New distance metric:
  \[ D(\mu, \nu) \triangleq E_{q,b}\left| E_{x\sim\mu}\,\mathbf{1}_{\{q^\top x + b \geq 0\}} - E_{y\sim\nu}\,\mathbf{1}_{\{q^\top y + b \geq 0\}} \right| \]
  This metric evaluates the difference between two distributions by randomly sampling half-spaces \(\{x : q^\top x + b \geq 0\}\) and comparing how much probability mass each distribution places in them.

Together, these components give a distribution steering method that is not tied to a particular dynamics model, which the paper evaluates experimentally on linear and nonlinear systems.
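The half-space metric and the state transition equation above can be illustrated with a short sample-based sketch in Python. This is not the authors' implementation. The summary does not say how the half-space parameters \((q, b)\) are drawn, so the sampling rule below (unit-norm directions, hyperplanes passing through randomly chosen pooled samples) is an assumption, and the names `halfspace_distance` and `propagate`, the single-integrator dynamics, and the feedback gain are hypothetical choices made only for illustration.

```python
import numpy as np


def halfspace_distance(xs, ys, n_halfspaces=512, rng=None):
    """Monte Carlo estimate of D(mu, nu) from samples xs ~ mu and ys ~ nu.

    xs, ys: arrays of shape (n_samples, dim).
    """
    rng = np.random.default_rng(rng)
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    dim = xs.shape[1]

    # ASSUMPTION: q is uniform on the unit sphere, and b is chosen so that
    # each hyperplane passes through a randomly picked pooled sample.
    q = rng.normal(size=(n_halfspaces, dim))
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    pooled = np.vstack([xs, ys])
    anchors = pooled[rng.integers(len(pooled), size=n_halfspaces)]
    b = -np.einsum("kd,kd->k", q, anchors)

    # Empirical probability content of each half-space {x : q^T x + b >= 0}.
    px = (xs @ q.T + b >= 0).mean(axis=0)
    py = (ys @ q.T + b >= 0).mean(axis=0)

    # Average absolute difference over the sampled half-spaces.
    return float(np.abs(px - py).mean())


def propagate(xs, policy, f, noise_std, rng):
    """Apply x_{t+1} = f(x_t, pi_t(x_t), w_t) to every particle in xs."""
    w = rng.normal(scale=noise_std, size=xs.shape)
    return f(xs, policy(xs), w)


# Hypothetical example: single-integrator dynamics with linear state feedback.
rng = np.random.default_rng(0)
particles = rng.normal(loc=[4.0, -2.0], scale=1.0, size=(2000, 2))
target = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(2000, 2))


def dynamics(x, u, w):
    return x + u + w


def feedback(x):
    return -0.5 * x


d_start = halfspace_distance(particles, target, rng=1)
for _ in range(10):
    particles = propagate(particles, feedback, dynamics, 0.3, rng)
d_end = halfspace_distance(particles, target, rng=1)
print(f"distance before: {d_start:.3f}, after: {d_end:.3f}")
```

In an MCTS setting, an estimate like this would play the role of the terminal term \(D(\mu_N, \mu_f)\) evaluated at the end of a simulated rollout, with each tree action selecting a feedback policy such as `feedback`. Because the estimate only needs samples from both distributions, it does not depend on the dynamics being linear.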

Discrete-Time Distribution Steering using Monte Carlo Tree Search
