Optimal Scaling Results for Moreau-Yosida Metropolis-adjusted Langevin Algorithms

Francesca R. Crucinio, Alain Durmus, Pablo Jiménez, Gareth O. Roberts
2024-06-19
Abstract: We consider a recently proposed class of MCMC methods which uses proximity maps instead of gradients to build proposal mechanisms which can be employed for both differentiable and non-differentiable targets. These methods have been shown to be stable for a wide class of targets, making them a valuable alternative to Metropolis-adjusted Langevin algorithms (MALA); and have found wide application in imaging contexts. The wider stability properties are obtained by building the Moreau-Yosida envelope for the target of interest, which depends on a parameter $\lambda$. In this work, we investigate the optimal scaling problem for this class of algorithms, which encompasses MALA, and provide practical guidelines for the implementation of these methods.
Computation, Probability, Statistics Theory
What problem does this paper attempt to address?
The problem that this paper attempts to solve is how to choose the scaling parameters of the Metropolis-adjusted Langevin algorithm with Moreau-Yosida regularization (MY-MALA) so that it samples high-dimensional target distributions efficiently. Specifically, the paper studies how the step-size parameter \(\sigma^2\) and the regularization parameter \(\lambda\) should be adjusted with the dimension \(d\) of the target distribution to achieve optimal performance.

### Main Research Contents

1. **Extending Existing Results**:
   - The paper first extends the results for Gaussian distributions in [34] to a broader class of finite-dimensional, sufficiently smooth target distributions.
   - Through theoretical analysis, the paper shows that in some cases an appropriate choice of the regularization parameter \(\lambda\) gives MY-MALA the same scaling properties as MALA.
2. **Optimal Scaling for Non-smooth Target Distributions**:
   - The paper studies the application of MY-MALA to the Laplace distribution, a common non-smooth target distribution.
   - The results show that for the Laplace distribution the optimal scaling of MY-MALA differs from that of MALA: the step size \(\sigma^2\) needs to decay as \(d^{-2/3}\), rather than as \(d^{-1/3}\) for MALA.
3. **Practical Guidance**:
   - Based on these theoretical results, the paper provides practical guidance for selecting the regularization parameter \(\lambda\) to optimize the performance of MY-MALA.

### Theoretical Contributions

- **Comparison of Different Target Distributions**:
  - For continuously differentiable target distributions, the paper shows how the scaling properties of MY-MALA depend on the relative decay rates of \(\lambda\) and \(\sigma^2\).
  - When \(\lambda\) decays faster than \(\sigma^2\), the performance of MY-MALA is close to that of MALA; when \(\lambda\) decays more slowly, the regularization affects the performance of MY-MALA.
- **Analysis of Non-smooth Target Distributions**:
  - For the Laplace distribution, the paper proves how non-smoothness influences optimal scaling and gives specific scaling formulas.

### Experimental Verification

- The paper verifies the theoretical results through numerical experiments and shows the performance of MY-MALA under different target distributions.

### Conclusions

- This paper provides a theoretical basis and practical guidance for the optimal parameter selection of MY-MALA, especially for non-smooth target distributions. In doing so, it extends the existing optimal-scaling framework and offers new insight into how the efficiency of the algorithm can be improved.
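To make the quantities in this summary concrete, the following is a minimal Python sketch of a MY-MALA-style sampler for a product Laplace target, \(\pi(x) \propto \exp(-\sum_i |x_i|)\). It is an illustration under stated assumptions, not the authors' implementation: the function names, the constant `ell`, and the choice \(\lambda = \texttt{lam\_ratio} \cdot \sigma^2\) are hypothetical. The sketch uses the standard facts that the proximity map of the \(\ell_1\) norm is soft-thresholding and that the gradient of the Moreau-Yosida envelope is \((x - \mathrm{prox}^{\lambda}(x))/\lambda\), and it sets \(\sigma^2 = \ell^2 d^{-2/3}\) to mirror the Laplace scaling quoted above.

```python
import numpy as np


def prox_l1(x, lam):
    """Proximity map of G(x) = ||x||_1 with parameter lam (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def grad_my_envelope(x, lam):
    """Gradient of the Moreau-Yosida envelope of G: (x - prox(x)) / lam."""
    return (x - prox_l1(x, lam)) / lam


def log_target(x):
    """Log-density of the product Laplace target, up to an additive constant."""
    return -np.sum(np.abs(x))


def log_proposal(y, x, sigma2, lam):
    """Log-density (up to a constant) of proposing y from x."""
    mean = x - 0.5 * sigma2 * grad_my_envelope(x, lam)
    return -np.sum((y - mean) ** 2) / (2.0 * sigma2)


def my_mala(d, n_iter, ell=1.0, lam_ratio=0.5, seed=0):
    """Run the sketch sampler; returns the final state and the acceptance rate.

    Assumptions (not taken from the paper): ell and lam_ratio are
    illustrative tuning constants, and lam is tied to sigma2 linearly.
    """
    rng = np.random.default_rng(seed)
    sigma2 = ell**2 * d ** (-2.0 / 3.0)  # step size decaying as d^(-2/3)
    lam = lam_ratio * sigma2             # hypothetical coupling of lam to sigma2
    x = rng.laplace(size=d)              # start from a draw of the target
    accepted = 0
    for _ in range(n_iter):
        mean = x - 0.5 * sigma2 * grad_my_envelope(x, lam)
        y = mean + np.sqrt(sigma2) * rng.standard_normal(d)
        # Metropolis-Hastings correction against the original (non-smoothed) target.
        log_alpha = (
            log_target(y) - log_target(x)
            + log_proposal(x, y, sigma2, lam)
            - log_proposal(y, x, sigma2, lam)
        )
        if np.log(rng.uniform()) < log_alpha:
            x = y
            accepted += 1
    return x, accepted / n_iter


if __name__ == "__main__":
    for dim in (10, 100, 1000):
        _, rate = my_mala(d=dim, n_iter=2000)
        print(f"d={dim:5d}  acceptance rate={rate:.3f}")
```

With `lam_ratio=0.5` the drift term simplifies so that the proposal mean is exactly the proximity map of the current state. Other values of `lam_ratio`, or letting \(\lambda\) decay at a different rate from \(\sigma^2\), can be tried in the same sketch to explore the regimes described in the summary.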