Abstract: We consider a recently proposed class of MCMC methods which uses proximity maps instead of gradients to build proposal mechanisms which can be employed for both differentiable and non-differentiable targets. These methods have been shown to be stable for a wide class of targets, making them a valuable alternative to Metropolis-adjusted Langevin algorithms (MALA); and have found wide application in imaging contexts. The wider stability properties are obtained by building the Moreau-Yosida envelope for the target of interest, which depends on a parameter $\lambda$. In this work, we investigate the optimal scaling problem for this class of algorithms, which encompasses MALA, and provide practical guidelines for the implementation of these methods.
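
For reference, the objects named in the abstract can be written out with their standard definitions. This is a sketch in notation assumed here rather than taken from the paper: $U = -\log \pi$ denotes the potential of the target $\pi$, $\sigma^2$ the proposal step size, and $Z$ a standard Gaussian vector. The gradient identity holds when $U$ is convex, proper, and lower semicontinuous. The proposal shown is the usual Langevin-type form with the envelope gradient in place of $\nabla U$; the paper's exact parametrization may differ.

```latex
\begin{align*}
U^{\lambda}(x) &= \min_{u \in \mathbb{R}^d} \Big\{ U(u) + \tfrac{1}{2\lambda}\,\|u - x\|^2 \Big\}, \\
\operatorname{prox}_U^{\lambda}(x) &= \operatorname*{arg\,min}_{u \in \mathbb{R}^d} \Big\{ U(u) + \tfrac{1}{2\lambda}\,\|u - x\|^2 \Big\}, \\
\nabla U^{\lambda}(x) &= \frac{x - \operatorname{prox}_U^{\lambda}(x)}{\lambda}, \\
Y &= X - \frac{\sigma^2}{2}\,\nabla U^{\lambda}(X) + \sigma Z, \qquad Z \sim \mathcal{N}(0, I_d).
\end{align*}
```

For such $U$, the envelope $U^{\lambda}$ converges pointwise to $U$ as $\lambda \to 0$, which is consistent with the abstract's remark that this class of algorithms encompasses MALA.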

What problem does this paper attempt to address?

The problem this paper addresses is how to choose the scaling parameters that make the Metropolis-adjusted Langevin algorithm with Moreau-Yosida regularization (MY-MALA) efficient when sampling high-dimensional target distributions. Specifically, the paper focuses on how to adjust the step-size parameter $\sigma^2$ and the regularization parameter $\lambda$ according to the dimension $d$ of the target distribution to achieve optimal algorithm performance.

### Main Research Contents

1. **Extending Existing Results**:
   - The paper first extends the conclusions regarding Gaussian distributions in [34] and generalizes them to a broader class of finite-dimensional (sufficiently smooth) target distributions.
   - Through theoretical analysis, the paper shows that in some cases an appropriate choice of the regularization parameter $\lambda$ gives MY-MALA the same scaling properties as MALA.
2. **Optimal Scaling for Non-smooth Target Distributions**:
   - The paper specifically studies the application of MY-MALA to the Laplace distribution, a common non-smooth target distribution.
   - The results show that for the Laplace distribution, the optimal scaling properties of MY-MALA differ from those of MALA on smooth targets: the step size $\sigma^2$ needs to decay as $d^{-2/3}$, rather than as $d^{-1/3}$ for MALA.
3. **Practical Guidance**:
   - Based on the above theoretical results, the paper provides practical guidance for selecting the regularization parameter $\lambda$ to optimize the performance of MY-MALA.

### Theoretical Contributions

- **Comparison of Different Target Distributions**:
  - For continuously differentiable target distributions, the paper shows how the scaling properties of MY-MALA depend on the relative decay rates of $\lambda$ and $\sigma^2$.
  - When $\lambda$ decays faster than $\sigma^2$, the performance of MY-MALA is close to that of MALA; when $\lambda$ decays more slowly, the performance of MY-MALA is affected.
- **Analysis of Non-smooth Target Distributions**:
  - For the Laplace distribution, the paper establishes the influence of non-smoothness on optimal scaling and gives specific scaling formulas.

### Experimental Verification

- The paper verifies the theoretical results through numerical experiments and shows the performance of MY-MALA under different target distributions.

### Conclusions

- This paper provides a theoretical basis and practical guidance for the optimal parameter selection of MY-MALA, especially when dealing with non-smooth target distributions. In doing so, it extends the existing theoretical framework for optimal scaling and offers new insights for improving algorithm efficiency.
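
The setting summarized above, MY-MALA applied to a Laplace target with step size $\sigma^2 \propto d^{-2/3}$, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the constant `ell`, and the coupling `lam = sigma2 / 2` are hypothetical choices made here for the example. The target is the product Laplace density $\pi(x) \propto \exp(-\|x\|_1)$, whose proximity map is the soft-thresholding operator.

```python
import numpy as np


def prox_l1(x, lam):
    """Proximity map of U(x) = ||x||_1 with parameter lam (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def grad_my_envelope(x, lam):
    """Gradient of the Moreau-Yosida envelope of U: (x - prox(x)) / lam."""
    return (x - prox_l1(x, lam)) / lam


def log_target(x):
    """Unnormalised log-density of the product Laplace target."""
    return -np.sum(np.abs(x))


def log_proposal(x_from, x_to, sigma2, lam):
    """Unnormalised log-density of the Gaussian proposal x_from -> x_to."""
    mean = x_from - 0.5 * sigma2 * grad_my_envelope(x_from, lam)
    return -np.sum((x_to - mean) ** 2) / (2.0 * sigma2)


def my_mala(d, n_iter, ell=1.0, seed=0):
    """Run MY-MALA on a d-dimensional Laplace target.

    Returns the chain (n_iter x d array) and the empirical acceptance rate.
    """
    rng = np.random.default_rng(seed)
    sigma2 = ell**2 * d ** (-2.0 / 3.0)  # step size decaying as d^(-2/3)
    lam = 0.5 * sigma2                   # assumption: lambda tied to sigma^2
    x = rng.laplace(size=d)              # start from a draw of the target
    samples = np.empty((n_iter, d))
    accepted = 0
    for i in range(n_iter):
        # Langevin-type proposal driven by the envelope gradient.
        mean = x - 0.5 * sigma2 * grad_my_envelope(x, lam)
        y = mean + np.sqrt(sigma2) * rng.standard_normal(d)
        # Metropolis-Hastings correction against the original target.
        log_alpha = (log_target(y) + log_proposal(y, x, sigma2, lam)
                     - log_target(x) - log_proposal(x, y, sigma2, lam))
        if np.log(rng.uniform()) < log_alpha:
            x = y
            accepted += 1
        samples[i] = x
    return samples, accepted / n_iter
```

Calling `my_mala(d=100, n_iter=5000)` returns the chain and its acceptance rate. Varying `ell`, or replacing the assumed rule for `lam`, gives a simple way to explore how the two parameters interact as `d` grows.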

Optimal Scaling Results for Moreau-Yosida Metropolis-adjusted Langevin Algorithms

On the Computational Complexity of Metropolis-Adjusted Langevin Algorithms for Bayesian Posterior Sampling

Efficient stochastic optimisation by unadjusted Langevin Monte Carlo. Application to maximum marginal likelihood and empirical Bayesian estimation

Efficient Bayesian Computation by Proximal Markov Chain Monte Carlo: When Langevin Meets Moreau

AdamMCMC: Combining Metropolis Adjusted Langevin with Momentum-based Optimization

Tuning of MCMC with Langevin, Hamiltonian, and other stochastic autoregressive proposals

Local Optimality and Generalization Guarantees for the Langevin Algorithm via Empirical Metastability

When is the Convergence Time of Langevin Algorithms Dimension Independent? A Composite Optimization Viewpoint

Convergence of Dirichlet Forms for MCMC Optimal Scaling with Dependent Target Distributions on Large Graphs

Parallelized Midpoint Randomization for Langevin Monte Carlo

Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients

Bregman Proximal Langevin Monte Carlo via Bregman--Moreau Envelopes

Rate-optimal refinement strategies for local approximation MCMC

Scalable couplings for the random walk Metropolis algorithm

Smoothing unadjusted Langevin algorithms for nonsmooth composite potential functions

Scalable Stochastic Gradient Riemannian Langevin Dynamics in Non-Diagonal Metrics

Scalability of Metropolis-within-Gibbs schemes for high-dimensional Bayesian models

Kinetic Theories for Metropolis Monte Carlo Methods

Benchmark of Schemes for Multiscale Molecular Dynamics Simulations

Alternative representation of the large deviation rate function and hyperparameter tuning schemes for Metropolis-Hastings Markov Chains