Abstract: A classical approach for approximating expectations of functions w.r.t. partially known distributions is to compute the average of function values along a trajectory of a Metropolis-Hastings (MH) Markov chain. A key part of the MH algorithm is a suitable acceptance/rejection of a proposed state, which ensures the correct stationary distribution of the resulting Markov chain. However, the rejection of proposals causes highly correlated samples. In particular, when a state is rejected it is not taken any further into account. In contrast, we consider an MH importance sampling estimator which explicitly incorporates all proposed states generated by the MH algorithm. The estimator satisfies a strong law of large numbers as well as a central limit theorem, and, in addition, we provide an explicit mean squared error bound. Remarkably, the asymptotic variance of the MH importance sampling estimator does not involve any correlation term, in contrast to its classical counterpart. Moreover, although the analyzed estimator uses the same amount of information as the classical MH estimator, it can outperform the latter in scenarios of moderate dimensions, as indicated by numerical experiments.

What problem does this paper attempt to address?

The paper addresses how to estimate the expected value of a function more efficiently when the underlying probability measure is only partially known. In the traditional Metropolis-Hastings (MH) algorithm, some proposed states are rejected while the Markov chain is generated. Rejections cause highly correlated samples, which reduces statistical efficiency, and a rejected state is not used any further. The paper analyzes an alternative, the MH importance sampling estimator, which explicitly uses all states proposed by the MH algorithm and not only the accepted ones. As a result, its asymptotic variance contains no correlation term, and it can improve estimation efficiency.

### Main Contributions

1. **MH Importance Sampling Estimator**: The paper introduces an estimator \( A_n(f) \) that uses all proposed states \( Y_k \), not just the chain states \( X_k \). The estimator has the form
   \[ A_n(f) = \frac{\sum_{k = 1}^n \bar{\rho}(X_k, Y_k) f(Y_k)}{\sum_{k = 1}^n \bar{\rho}(X_k, Y_k)} \]
   where \( \bar{\rho}(x, y)=\frac{d\mu}{dP(x, \cdot)}(y) \) is an importance weight that corrects for the difference between the proposal distribution \( P(x, \cdot) \) and the target distribution \( \mu \).
2. **Theoretical Results**:
   - **Strong Law of Large Numbers (SLLN)**: For any \( f\in L^1(\mu) \), \( A_n(f) \) converges almost surely to \( E_\mu(f) \) as \( n\rightarrow\infty \).
   - **Central Limit Theorem (CLT)**: For any \( f\in L^2(\mu) \), the scaled error \( \sqrt{n}(A_n(f)-E_\mu(f)) \) converges in distribution to a normal distribution \( N(0, \sigma_A^2(f)) \) with mean 0 and variance \( \sigma_A^2(f) \) as \( n\rightarrow\infty \).
   - **Mean Squared Error Bound**: An explicit upper bound for the mean squared error of \( A_n(f) \) is provided.
3. **Numerical Experiments**: Numerical experiments indicate that, in moderate dimensions, the estimator \( A_n(f) \) can outperform the classical MH estimator \( S_n(f) \), the average of \( f \) along the chain.

### Related Work

- **Importance Sampling**: Importance sampling is a widely used technique for estimating expected values and has recently received considerable attention in both theory and application.
- **Improvements of the MH Algorithm**: Other researchers have also proposed methods that combine the MH algorithm with importance sampling, such as using the MH algorithm to approximate the minimum-variance importance distribution, or approaching the target distribution by mixing importance distributions.

### Conclusion

By analyzing the MH importance sampling estimator, the paper provides a way to estimate the expected value of a function more efficiently when the probability measure is partially known. The method is supported by rigorous theoretical results and can outperform the classical MH estimator in numerical experiments in moderate dimensions.
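
The estimator \( A_n(f) \) from the main contributions can be illustrated with a minimal Python sketch. It is not the paper's experimental setup: the one-dimensional standard normal target, the Gaussian random-walk proposal, the step size, and all function names (`log_target_unnorm`, `log_proposal_density`, `mh_estimators`) are assumptions chosen for illustration. The weight is computed from an unnormalized target density, which suffices here because constant factors cancel in the ratio defining \( A_n(f) \).

```python
import math
import random


def log_target_unnorm(x):
    """Log of an unnormalized standard normal density (illustrative target)."""
    return -0.5 * x * x


def log_proposal_density(x, y, step):
    """Log density at y of the Gaussian random-walk proposal N(x, step^2)."""
    z = (y - x) / step
    return -0.5 * z * z - math.log(step * math.sqrt(2.0 * math.pi))


def mh_estimators(f, n, step=1.5, x0=0.0, seed=0):
    """Run n MH steps and return the pair (S_n(f), A_n(f)).

    S_n(f) averages f over the chain states X_k (classical MH estimator).
    A_n(f) is the self-normalized weighted average of f over all proposals Y_k.
    """
    rng = random.Random(seed)
    x = x0
    chain_sum = 0.0      # running sum of f(X_k)
    weighted_sum = 0.0   # running sum of w_k * f(Y_k)
    weight_total = 0.0   # running sum of w_k
    for _ in range(n):
        y = rng.gauss(x, step)  # proposal Y_k drawn from P(X_k, .)
        # Importance weight: unnormalized target density over proposal density.
        w = math.exp(log_target_unnorm(y) - log_proposal_density(x, y, step))
        weighted_sum += w * f(y)
        weight_total += w
        chain_sum += f(x)
        # Symmetric proposal, so the acceptance ratio is the target ratio.
        log_alpha = log_target_unnorm(y) - log_target_unnorm(x)
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = y  # accept; otherwise the chain stays at x
    return chain_sum / n, weighted_sum / weight_total


if __name__ == "__main__":
    s_n, a_n = mh_estimators(lambda t: t * t, n=50_000, seed=1)
    print("classical S_n(f):", s_n)
    print("importance sampling A_n(f):", a_n)
```

For \( f(x) = x^2 \) both estimates should be close to the exact value \( E_\mu(f) = 1 \) in this toy setting. The step size is deliberately larger than the target's scale so that the importance weights remain well behaved; with a much smaller step the weights in this example can become heavy-tailed.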

On a Metropolis-Hastings importance sampling estimator
