Abstract: This article proposes a novel approach to constructing data-driven online solutions to optimization problems (P) subject to a class of distributionally uncertain dynamical systems. The framework simultaneously learns the distributional system uncertainty from a finite historical dataset, via a parameterized, control-dependent ambiguity set, and uses it to make online decisions with probabilistic bounds on the regret function. Leveraging the merits of machine learning, the main technical approach relies on the theory of distributionally robust optimization (DRO) to hedge against uncertainty and provide less conservative results than standard robust optimization approaches. Starting from recent results that describe ambiguity sets via parameterized, control-dependent empirical distributions and ambiguity radii, we first present a tractable reformulation of the corresponding optimization problem that maintains the probabilistic guarantees. We then specialize these problems to the cases of 1) optimal one-stage control of distributionally uncertain nonlinear systems, and 2) resource allocation under distributional uncertainty. A novelty of this work is that it extends DRO to online optimization problems subject to a distributionally uncertain dynamical system constraint, handled via a control-dependent ambiguity set that leads to online-tractable optimization with probabilistic guarantees on regret bounds. Further, we introduce an online version of Nesterov's accelerated-gradient algorithm and analyze its performance in solving this class of problems via dissipativity theory.
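
To make the last point concrete, the following is a minimal sketch of an accelerated-gradient iteration applied online, that is, taking one Nesterov-style step per time instant on an objective that changes over time. It is not the algorithm analyzed in the article: the function name `online_nesterov`, the drifting quadratic objective, the matrix `A`, the drift rate, and the constant step size and momentum (the standard choices for a strongly convex quadratic) are all assumptions made for illustration, and neither the control-dependent ambiguity set nor the dissipativity analysis is reproduced.

```python
import numpy as np

def online_nesterov(grad, x0, step, momentum, horizon):
    """Take one accelerated-gradient step per time instant t.

    grad(t, y) must return the gradient of the time-t objective at y.
    Returns the array of iterates, including the initial point.
    """
    x_prev = np.asarray(x0, dtype=float)
    x = x_prev.copy()
    iterates = [x.copy()]
    for t in range(horizon):
        y = x + momentum * (x - x_prev)        # look-ahead (extrapolation) point
        x_prev, x = x, y - step * grad(t, y)   # gradient step taken at the look-ahead point
        iterates.append(x.copy())
    return np.array(iterates)

# Hypothetical time-varying objective: f_t(x) = 0.5 * (x - c_t)^T A (x - c_t),
# whose minimizer c_t drifts slowly along the unit circle.
A = np.diag([1.0, 10.0])   # strong convexity mu = 1, smoothness L = 10
mu, L = 1.0, 10.0

def target(t):
    return np.array([np.cos(0.001 * t), np.sin(0.001 * t)])

def grad(t, y):
    return A @ (y - target(t))

step = 1.0 / L
momentum = (np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))
horizon = 500

iterates = online_nesterov(grad, np.zeros(2), step, momentum, horizon)

# Tracking error of the final iterate with respect to the most recent minimizer.
err = np.linalg.norm(iterates[-1] - target(horizon - 1))
print(f"final tracking error: {err:.4f}")
```

In this toy setting the iterate cannot converge to a fixed point because the minimizer keeps moving; instead it settles into a small tracking error that scales with the drift rate, which is the kind of behavior a regret-style analysis quantifies.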

Robust $Q$-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty

Robust Q-Learning for finite ambiguity sets

Safe Wasserstein Constrained Deep Q-Learning

Distributionally Robust Safety Verification for Markov Decision Processes

Wasserstein Distributionally Robust Optimization: Theory and Applications in Machine Learning

A Finite Sample Complexity Bound for Distributionally Robust Q-learning

Robust Reinforcement Learning with Wasserstein Constraint

Distributionally Robust Chance Constrained Games under Wasserstein Ball

Distributionally Robust Density Control with Wasserstein Ambiguity Sets

First-Order Methods for Wasserstein Distributionally Robust MDP

Sample Complexity of Variance-reduced Distributionally Robust Q-learning

Non-concave distributionally robust stochastic control in a discrete time finite horizon setting

Robust Offline Reinforcement Learning for Non-Markovian Decision Processes

Distributionally robust optimization for sequential decision-making

Data-driven Distributionally Robust Optimal Stochastic Control Using the Wasserstein Metric

Differentiable Distributionally Robust Optimization Layers

Robust Probabilistic Prediction for Stochastic Dynamical Systems

Outlier-Robust Wasserstein DRO

Bounding the Difference between the Values of Robust and Non-Robust Markov Decision Problems

Online Optimization and Ambiguity-Based Learning of Distributionally Uncertain Dynamic Systems

Regularized Q-learning through Robust Averaging