Abstract: We consider the constrained sampling problem where the goal is to sample from a target distribution $\pi(x)\propto e^{-f(x)}$ when $x$ is constrained to lie on a convex body $\mathcal{C}$. Motivated by penalty methods from continuous optimization, we propose penalized Langevin Dynamics (PLD) and penalized underdamped Langevin Monte Carlo (PULMC) methods that convert the constrained sampling problem into an unconstrained sampling problem by introducing a penalty function for constraint violations. When $f$ is smooth and gradients are available, we get $\tilde{\mathcal{O}}(d/\varepsilon^{10})$ iteration complexity for PLD to sample the target up to an $\varepsilon$-error where the error is measured in the TV distance and $\tilde{\mathcal{O}}(\cdot)$ hides logarithmic factors. For PULMC, we improve the result to $\tilde{\mathcal{O}}(\sqrt{d}/\varepsilon^{7})$ when the Hessian of $f$ is Lipschitz and the boundary of $\mathcal{C}$ is sufficiently smooth. To our knowledge, these are the first convergence results for underdamped Langevin Monte Carlo methods in constrained sampling that handle non-convex $f$ and provide guarantees with the best dimension dependency among existing methods with deterministic gradients. If unbiased stochastic estimates of the gradient of $f$ are available, we propose PSGLD and PSGULMC methods that can handle stochastic gradients and are scalable to large datasets without requiring Metropolis-Hastings correction steps. For PSGLD and PSGULMC, when $f$ is strongly convex and smooth, we obtain $\tilde{\mathcal{O}}(d/\varepsilon^{18})$ and $\tilde{\mathcal{O}}(d\sqrt{d}/\varepsilon^{39})$ iteration complexity in the 2-Wasserstein distance. When $f$ is smooth and can be non-convex, we provide finite-time performance bounds and iteration complexity results. Finally, we illustrate the performance on Bayesian LASSO regression and Bayesian constrained deep learning problems.

What problem does this paper attempt to address?

This paper addresses the problem of sampling from a target distribution $\pi(x) \propto e^{-f(x)}$ when the variable $x$ is constrained to lie in a convex body $C \subset \mathbb{R}^d$, and asks how to generate such samples efficiently. The problem has wide applications in Bayesian statistical inference, Bayesian formulations of inverse problems, and classification and regression tasks in machine learning.

To handle the constraint, the authors propose two penalty-function-based methods: Penalized Langevin Dynamics (PLD) and Penalized Underdamped Langevin Monte Carlo (PULMC). These methods transform the constrained sampling problem into an unconstrained one by introducing a penalty function for constraint violations, which avoids having to deal with the constraint set $C$ directly.

### Main contributions

1. **Iteration complexity analysis**:
   - When $f$ is smooth and its gradient is available, the iteration complexity of PLD is $\tilde{O}(d/\epsilon^{10})$, where the error $\epsilon$ is measured in total variation distance.
   - For PULMC, when the Hessian of $f$ is Lipschitz continuous and the boundary of $C$ is sufficiently smooth, the iteration complexity improves to $\tilde{O}(\sqrt{d}/\epsilon^7)$.
2. **Stochastic gradient support**:
   - When only unbiased stochastic gradient estimates are available, the authors propose the PSGLD and PSGULMC methods. These methods handle stochastic gradients in large-scale Bayesian learning problems and do not require a Metropolis-Hastings correction step.
   - For strongly convex and smooth $f$, the iteration complexities of PSGLD and PSGULMC are $\tilde{O}(d/\epsilon^{18})$ and $\tilde{O}(d\sqrt{d}/\epsilon^{39})$ respectively, measured in the 2-Wasserstein distance.
   - For the more general case, where $f$ is smooth but may be non-convex, finite-time performance bounds and iteration complexity results are also provided.
3. **Practical applications**:
   - The authors illustrate the performance of the proposed algorithms through experiments on Bayesian LASSO regression and Bayesian constrained deep learning problems.

### Technical contributions

- **KL divergence and Csiszár-Kullback-Pinsker inequality**: The authors bound the Kullback-Leibler (KL) divergence between the penalized distribution $\pi_\delta$ and the target $\pi$, and apply the weighted Csiszár-Kullback-Pinsker inequality to control the 2-Wasserstein distance between $\pi_\delta$ and $\pi$.
- **Regularization and strong convexity**: By regularizing the constraint domain $C$, the regularized domain $C_\alpha$ is made $\alpha$-strongly convex, and it is proved that $f + S_\alpha/\delta$ is strongly convex outside a compact domain.
- **Dissipativity condition**: For non-convex $f$ and stochastic gradients, the authors prove that $f + S/\delta$ satisfies a dissipativity condition, the key technical result needed to apply existing convergence results for unconstrained Langevin algorithms with non-convex targets.

Overall, the paper tackles constrained sampling with penalty-function-based methods, provides a detailed theoretical convergence rate analysis, and illustrates the methods on practical applications.
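To make the penalty idea concrete, here is a minimal sketch of an Euler-Maruyama discretization of overdamped Langevin dynamics applied to the penalized potential $f + S/\delta$, that is, $x_{k+1} = x_k - \eta\,(\nabla f(x_k) + \nabla S(x_k)/\delta) + \sqrt{2\eta}\,\xi_k$ with standard Gaussian $\xi_k$. Several choices here are assumptions made for illustration and are not taken from the paper: the penalty is the squared distance to $C$, so $\nabla S(x) = 2(x - P_C(x))$ with $P_C$ the Euclidean projection; $C$ is the unit ball; $f(x) = \|x\|^2/2$; and the function names, step size `step`, and penalty parameter `delta` are hypothetical.

```python
import numpy as np


def project_onto_ball(x, radius=1.0):
    """Euclidean projection onto the ball {x : ||x|| <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def penalized_langevin(grad_f, project, x0, delta, step, n_iters, rng):
    """Unconstrained Langevin iterations on the penalized potential f + S/delta.

    Assumption for this sketch: S(x) = ||x - project(x)||^2, the squared
    distance to the constraint set, whose gradient is 2 * (x - project(x)).
    """
    x = np.array(x0, dtype=float)
    samples = np.empty((n_iters, x.size))
    for k in range(n_iters):
        # Penalty gradient is zero inside the set and pulls x back when outside.
        grad_penalty = 2.0 * (x - project(x)) / delta
        noise = rng.standard_normal(x.size)
        x = x - step * (grad_f(x) + grad_penalty) + np.sqrt(2.0 * step) * noise
        samples[k] = x
    return samples


rng = np.random.default_rng(0)
samples = penalized_langevin(
    grad_f=lambda x: x,          # gradient of the illustrative f(x) = ||x||^2 / 2
    project=project_onto_ball,
    x0=np.zeros(2),
    delta=0.01,                  # smaller delta penalizes violations more strongly
    step=1e-3,
    n_iters=20000,
    rng=rng,
)
norms = np.linalg.norm(samples, axis=1)
print("fraction of iterates with norm <= 1.2:", np.mean(norms <= 1.2))
```

The iterates are not forced to stay inside $C$; they concentrate near it, and the penalized distribution $\pi_\delta$ only approximates $\pi$. In this sketch the step size must stay small relative to $\delta$ (here `2 * step / delta = 0.2`) for the explicit update to remain stable.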

Penalized Overdamped and Underdamped Langevin Monte Carlo Algorithms for Constrained Sampling

Constrained Sampling with Primal-Dual Langevin Monte Carlo

Projected Langevin Monte Carlo algorithms in non-convex and super-linear setting

Penalized Langevin dynamics with vanishing penalty for smooth and log-concave targets

Gradient-adjusted underdamped Langevin dynamics for sampling

Laplacian Smoothing Stochastic Gradient Markov Chain Monte Carlo

Non-Log-Concave and Nonsmooth Sampling via Langevin Monte Carlo Algorithms

User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient

Double Randomized Underdamped Langevin with Dimension-Independent Convergence Guarantee

Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo

Sharp convergence rates for Langevin dynamics in the nonconvex setting

Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients

Langevin Monte Carlo for strongly log-concave distributions: Randomized midpoint revisited

Underdamped Langevin MCMC: A non-asymptotic analysis

A Dynamical System View of Langevin-Based Non-Convex Sampling

Efficient Sampling on Riemannian Manifolds via Langevin MCMC

Parallelized Midpoint Randomization for Langevin Monte Carlo

Analysis of Langevin Monte Carlo from Poincaré to Log-Sobolev

Constrained Ensemble Langevin Monte Carlo

Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling