Abstract: Trajectory inference seeks to recover the temporal dynamics of a population from snapshots of its (uncoupled) temporal marginals, i.e. when the observed particles are not tracked over time. Lavenant et al. <a class="link-https" data-arxiv-id="2102.09204" href="https://arxiv.org/abs/2102.09204">arXiv:2102.09204</a> addressed this challenging problem under a stochastic differential equation (SDE) model with a gradient-driven drift in the observed space, introducing a minimum entropy estimator relative to the Wiener measure. Chizat et al. <a class="link-https" data-arxiv-id="2205.07146" href="https://arxiv.org/abs/2205.07146">arXiv:2205.07146</a> then provided a practical grid-free mean-field Langevin (MFL) algorithm using Schrödinger bridges. Motivated by the success of observable state space models in the traditional paired trajectory inference problem (e.g. target tracking), we extend the above framework to a class of latent SDEs in the form of observable state space models. In this setting, we use partial observations to infer trajectories in the latent space under a specified dynamics model (e.g. the constant velocity/acceleration models from target tracking). We introduce PO-MFL to solve this latent trajectory inference problem and provide theoretical guarantees by extending the results of <a class="link-https" data-arxiv-id="2102.09204" href="https://arxiv.org/abs/2102.09204">arXiv:2102.09204</a> to the partially observed setting. We leverage the MFL framework of <a class="link-https" data-arxiv-id="2205.07146" href="https://arxiv.org/abs/2205.07146">arXiv:2205.07146</a>, yielding an algorithm based on entropic optimal transport (OT) between dynamics-adjusted adjacent time marginals. Experiments validate the robustness of our method and the exponential convergence of the MFL dynamics, and show that it significantly outperforms the latent-free method of <a class="link-https" data-arxiv-id="2205.07146" href="https://arxiv.org/abs/2205.07146">arXiv:2205.07146</a> in key scenarios.
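
To make the phrase "entropic OT between dynamics-adjusted adjacent time marginals" concrete, the following is a minimal sketch, not the PO-MFL algorithm itself. It assumes one plausible reading of "dynamics-adjusted": latent particles at one time are pushed forward through a constant-velocity transition matrix before an entropic OT plan to the next marginal is computed with log-domain Sinkhorn iterations. The function names (`sinkhorn_plan`, `logsumexp`), the matrix `A`, the step `dt`, the regularization `eps`, and the synthetic data are all illustrative assumptions and do not come from the paper.

```python
import numpy as np


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def sinkhorn_plan(x, y, eps=1.0, n_iter=500):
    """Entropic OT plan between uniform empirical measures on x (n, d) and y (m, d).

    Uses the squared-Euclidean cost 0.5 * |x - y|^2 and log-domain updates
    of the dual potentials f and g.
    """
    n, m = len(x), len(y)
    log_a = np.full(n, -np.log(n))  # uniform source weights
    log_b = np.full(m, -np.log(m))  # uniform target weights
    cost = 0.5 * np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iter):
        f = -eps * logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    log_plan = (f[:, None] + g[None, :] - cost) / eps + log_a[:, None] + log_b[None, :]
    return np.exp(log_plan)


# Hypothetical latent state: (position, velocity) under a constant-velocity model.
dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])  # assumed transition matrix, for illustration only

rng = np.random.default_rng(0)
x = 0.5 * rng.standard_normal((20, 2))               # latent particles at time t
y = x @ A.T + 0.05 * rng.standard_normal((20, 2))    # latent particles at time t + dt

# "Dynamics adjustment" (assumption): push particles forward before matching.
x_pushed = x @ A.T
plan = sinkhorn_plan(x_pushed, y, eps=1.0)
```

Each entry `plan[i, j]` is the mass coupled between the pushed-forward particle `i` and particle `j` of the next marginal; a smaller `eps` concentrates the plan but typically needs more iterations to converge.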

Generalized Maximum Entropy Differential Dynamic Programming

Second-Order Stein Variational Dynamic Optimization

Entropy Regularised Deterministic Optimal Control: From Path Integral Solution to Sample-Based Trajectory Optimisation

Ergodic Trajectory Optimization on Generalized Domains Using Maximum Mean Discrepancy

Deterministic Trajectory Optimization through Probabilistic Optimal Control

Maximum Entropy Density Control of Discrete-Time Linear Systems with Quadratic Cost

Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization

Exploratory Control with Tsallis Entropy for Latent Factor Models

Maximum entropy optimal density control of discrete-time linear systems and Schrödinger bridges

Tsallis Entropy Regularization for Linearly Solvable MDP and Linear Quadratic Regulator

Dynamic Programming-based Approximate Optimal Control for Model-Based Reinforcement Learning

Stochastic trajectory optimization for mechanical systems with parametric uncertainties

Differential Dynamic Programming for time-delayed systems

Partially Observed Trajectory Inference using Optimal Transport and a Dynamics Prior

Transfer Entropy in MDPs with Temporal Logic Specifications

Variational Dynamic Programming for Stochastic Optimal Control

Improved Exploration for Safety-Embedded Differential Dynamic Programming Using Tolerant Barrier States

Epsilon-Greedy Thompson Sampling to Bayesian Optimization

Decentralized trajectory optimization for multi-agent exploration

Deep Gaussian Covariance Network with Trajectory Sampling for Data-Efficient Policy Search