What problem does this paper attempt to address?

The paper addresses how to assess the rate at which a Markov chain converges to its stationary distribution. Specifically, it advocates the average-mixing time as a more optimistic and easier-to-estimate alternative to the traditional total variation mixing time ($t_{\text{mix}}$).

### Background and Motivation

The traditional $t_{\text{mix}}$ is defined with respect to the worst-case initial distribution, which often yields a pessimistic estimate of the convergence speed and makes the quantity difficult to infer from observational data. Moreover, $t_{\text{mix}}$ is usually unknown, and the theoretical upper bounds that are available are often conservative. Estimating $t_{\text{mix}}$ from a single observed trajectory is statistically hard: the required sample complexity is very high, especially when the state space is large or infinite.

### Solution

The paper proposes the **average-mixing time** ($t^\sharp_{\text{mix}}$) as a more optimistic alternative for evaluating the convergence speed of Markov chains. Specifically:

1. **Definition and Properties**:
   - The average-mixing time $t^\sharp_{\text{mix}}$ is defined as the minimum time needed for the total variation distance to stationarity, averaged over an initial state drawn from the stationary distribution, to fall below a given error threshold.
   - $t^\sharp_{\text{mix}}$ can be significantly smaller than $t_{\text{mix}}$, especially in small state spaces.
2. **Estimation Method**:
   - The paper proposes a method for estimating $t^\sharp_{\text{mix}}$ from a single trajectory and proves that this is statistically more efficient than estimating $t_{\text{mix}}$.
   - In the uniformly ergodic setting in particular, the paper gives explicit upper bounds on the sample complexity, indicating that the average-mixing time can be estimated with a sublinear number of samples.
3. **Theoretical Results**:
   - The paper relates the average-mixing time to the relaxation time through spectral methods and geometric ergodicity assumptions.
   - It further explores the connection between the average-mixing time and the β-mixing coefficients, showing that the average-mixing time can be used to control the deviations of functions of Markov chains.

### Practical Significance

- **Machine Learning and Statistical Inference**: The average-mixing time can be used to analyze machine learning algorithms on weakly dependent data, yielding sharper generalization and regret bounds.
- **Markov Chain Monte Carlo Methods**: In MCMC, the average-mixing time can serve as a tool for diagnosing convergence, especially when the state space is large or infinite.

In conclusion, by introducing and analyzing the average-mixing time, the paper provides a more practical framework for evaluating the convergence speed of Markov chains and addresses several limitations of the traditional worst-case approach.
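
To make the contrast between the two quantities concrete, here is a minimal sketch that is not taken from the paper. It assumes the average-mixing time at threshold $\xi$ is the first $t$ with $\sum_x \pi(x)\,\|P^t(x,\cdot) - \pi\|_{\text{TV}} \le \xi$, and the worst-case mixing time is the first $t$ with $\max_x \|P^t(x,\cdot) - \pi\|_{\text{TV}} \le \xi$. The function names, the threshold $\xi = 1/4$, and the 3-state transition matrix are all illustrative assumptions. The matrix is known here, so this computes the quantities exactly rather than estimating them from a trajectory.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def stationary(p: Matrix, iters: int = 5000) -> List[float]:
    """Approximate the stationary distribution by power iteration.

    Assumes the chain is irreducible and aperiodic.
    """
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi


def tv(mu: List[float], nu: List[float]) -> float:
    """Total variation distance between two distributions on a finite set."""
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))


def mixing_times(p: Matrix, xi: float = 0.25, t_max: int = 10_000) -> Tuple[int, int]:
    """Return (average-mixing time, worst-case mixing time) at threshold xi."""
    n = len(p)
    pi = stationary(p)
    pt = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0
    t_avg = t_worst = None
    for t in range(t_max + 1):
        dists = [tv(row, pi) for row in pt]              # distance from each start state
        avg = sum(w * d for w, d in zip(pi, dists))      # start state drawn from pi
        worst = max(dists)                               # worst-case start state
        if t_avg is None and avg <= xi:
            t_avg = t
        if t_worst is None and worst <= xi:
            t_worst = t
        if t_avg is not None and t_worst is not None:
            return t_avg, t_worst
        pt = mat_mul(pt, p)
    raise RuntimeError("thresholds not reached within t_max steps")


# Hypothetical chain: state 0 is rarely visited but slow to leave.
P = [
    [0.95, 0.05, 0.0],
    [0.001, 0.5, 0.499],
    [0.0, 0.5, 0.5],
]

t_avg, t_worst = mixing_times(P)
print(f"average-mixing time: {t_avg}, worst-case mixing time: {t_worst}")
```

In this toy chain the slow state carries about 1% of the stationary mass, so it dominates the worst-case quantity while contributing almost nothing to the average. This is the kind of gap that motivates the more optimistic measure.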

Optimistic Estimation of Convergence in Markov Chains with the Average-Mixing Time
