Deep Learning for Computing Convergence Rates of Markov Chains

Yanlin Qu, Jose Blanchet, Peter Glynn
2024-05-31
Abstract: Convergence rate analysis for general state-space Markov chains is fundamentally important in areas such as Markov chain Monte Carlo and algorithmic analysis (for computing explicit convergence bounds). This problem, however, is notoriously difficult because traditional analytical methods often do not generate practically useful convergence bounds for realistic Markov chains. We propose the Deep Contractive Drift Calculator (DCDC), the first general-purpose sample-based algorithm for bounding the convergence of Markov chains to stationarity in Wasserstein distance. The DCDC has two components. First, inspired by the new convergence analysis framework in (Qu et al., 2023), we introduce the Contractive Drift Equation (CDE), the solution of which leads to an explicit convergence bound. Second, we develop an efficient neural-network-based CDE solver. Equipped with these two components, DCDC solves the CDE and converts the solution into a convergence bound. We analyze the sample complexity of the algorithm and further demonstrate the effectiveness of the DCDC by generating convergence bounds for realistic Markov chains arising from stochastic processing networks as well as constant step-size stochastic optimization.
Machine Learning, Probability
What problem does this paper attempt to address?
The paper addresses the problem of estimating convergence rates of Markov chains on general state spaces. Specifically, it asks how to obtain practically useful bounds on the convergence of a Markov chain to its stationary distribution in Wasserstein distance. Traditional analytical methods often fail to produce useful convergence bounds for the realistic Markov chains that arise in applications, which makes the problem notoriously difficult. To this end, the authors propose the Deep Contractive Drift Calculator (DCDC), the first general-purpose sample-based algorithm for bounding the convergence of Markov chains in Wasserstein distance.

### Main Contributions

1. **Proposing DCDC**: A deep-learning-based method that produces bounds on the convergence rate of Markov chains on general state spaces.
2. **Sample Complexity Analysis**: The authors analyze the sample complexity of the algorithm.
3. **Practical Application Examples**: Numerical experiments demonstrate DCDC on realistic Markov chains from operations research (stochastic processing networks) and machine learning (constant step-size stochastic optimization).

### Key Concepts

- **Contractive Drift Condition (CD)**: An inequality condition that characterizes the convergence properties of a Markov chain.
- **Contractive Drift Equation (CDE)**: The equality version of CD; its solution can be converted into an explicit convergence bound.
- **Physics-Informed Neural Networks (PINNs)**: A method for solving partial differential equations (PDEs), which this paper extends to solve CDEs.

### Mathematical Formulas

- **Contractive Drift Condition**: \[ K_V(x)=E[D_f(x)V(f(x))]\leq V(x)-U(x),\quad x\in X \] where \(V, U:X\to\mathbb{R}^+\) are bounded positive functions.
- **Contractive Drift Equation**: \[ K_V(x)=E[D_f(x)V(f(x))]=V(x)-U(x),\quad x\in X \]
- **Wasserstein Distance**: \[ W(\mu,\nu)=\inf_{\pi\in C(\mu,\nu)}\int_{X\times X}\|x - y\|\,\pi(dx,dy) \] where \(C(\mu,\nu)\) is the set of all couplings of \(\mu\) and \(\nu\).

### Experimental Results

- **Mini-batch SGD**: For logistic regression with L2 regularization, DCDC solves the CDE accurately and yields an exponential convergence rate.
- **Tandem Fluid Network**: For this drift-dominated system, DCDC also finds a suitable Lyapunov function and yields a convergence bound.
- **Discovering Meaningful Wedge-shaped Lyapunov Functions**: For cases with neither drift nor contraction, DCDC discovers an inverted V-shaped Lyapunov function that handles such problems.

### Conclusion

DCDC is a deep-learning-based tool for bounding the convergence rate of Markov chains on general state spaces. The numerical experiments demonstrate its feasibility and effectiveness in practical applications.
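
### Illustrative Sketch of the Contractive Drift Condition

To make the contractive drift condition concrete, the following is a minimal sketch on a toy chain that is not taken from the paper: the linear recursion \(f(x) = a x + \sigma Z\) with \(|a| < 1\) and standard normal \(Z\). It assumes that \(D_f(x)\) denotes the local Lipschitz constant of the random map \(f\) at \(x\) (this summary does not define it), which for this map is the constant \(|a|\). The names `estimate_kv`, `constant_v`, and `bumpy_v` are hypothetical helpers introduced only for illustration. The sketch estimates \(K_V(x)\) by Monte Carlo for a candidate \(V\) and reports the residual \(U(x) = V(x) - K_V(x)\), which should be positive if the candidate satisfies the condition.

```python
import numpy as np


def estimate_kv(x, v, a=0.5, sigma=1.0, n_samples=10_000, seed=0):
    """Monte Carlo estimate of K_V(x) = E[D_f(x) V(f(x))] for the toy map
    f(x) = a*x + sigma*Z with Z ~ N(0, 1).

    Assumption: D_f(x) is the local Lipschitz constant of f at x, which
    equals |a| for this linear map regardless of the noise draw.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_samples)
    next_states = a * x + sigma * noise  # i.i.d. samples of f(x)
    local_lipschitz = abs(a)
    return float(np.mean(local_lipschitz * v(next_states)))


def constant_v(y):
    """Candidate V identically equal to 1."""
    return np.ones_like(y, dtype=float)


def bumpy_v(y):
    """A non-constant bounded positive candidate V (arbitrary choice)."""
    return 1.0 + 0.1 * np.cos(y)


a = 0.5
xs = np.linspace(-3.0, 3.0, 7)  # a few test states

# Residual U(x) = V(x) - K_V(x) for each candidate V.
kv_const = np.array([estimate_kv(x, constant_v, a=a) for x in xs])
u_const = constant_v(xs) - kv_const

kv_bumpy = np.array([estimate_kv(x, bumpy_v, a=a) for x in xs])
u_bumpy = bumpy_v(xs) - kv_bumpy

print("U for constant V:", np.round(u_const, 3))
print("U for bumpy V:   ", np.round(u_bumpy, 3))
```

For the constant candidate the residual equals \(1 - |a|\) at every state, so the condition holds with a constant \(U\). In this toy case a closed-form \(V\) is easy to guess; the point of DCDC, as described above, is to find such a \(V\) with a neural network when no guess is available.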