Concentration without Independence via Information Measures

Amedeo Roberto Esposito, Marco Mondelli
2023-10-30
Abstract: We propose a novel approach to concentration for non-independent random variables. The main idea is to "pretend" that the random variables are independent and pay a multiplicative price measuring how far they are from actually being independent. This price is encapsulated in the Hellinger integral between the joint and the product of the marginals, which is then upper bounded leveraging tensorisation properties. Our bounds represent a natural generalisation of concentration inequalities in the presence of dependence: we recover exactly the classical bounds (McDiarmid's inequality) when the random variables are independent. Furthermore, in a "large deviations" regime, we obtain the same decay in the probability as for the independent case, even when the random variables display non-trivial dependencies. To show this, we consider a number of applications of interest. First, we provide a bound for Markov chains with finite state space. Then, we consider the Simple Symmetric Random Walk, which is a non-contracting Markov chain, and a non-Markovian setting in which the stochastic process depends on its entire past. To conclude, we propose an application to Markov Chain Monte Carlo methods, where our approach leads to an improved lower bound on the minimum burn-in period required to reach a certain accuracy. In all of these settings, we provide a regime of parameters in which our bound fares better than what the state of the art can provide.
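The "pretend independence and pay a multiplicative price" idea in the abstract can be illustrated with a short worked derivation. This is only a sketch of how a bound of this type can arise from Hölder's inequality. The notation (\(P\), \(Q\), \(\alpha\), \(\beta\), \(c_i\)) and the absolute-continuity assumption are introduced here for illustration, and the exact statements and constants in the paper may differ.

```latex
% Assumed notation (not taken from the paper):
%   P = joint law of X^n = (X_1, ..., X_n),  Q = product of the marginals,
%   \alpha > 1,  1/\alpha + 1/\beta = 1,  P \ll Q,
%   f has bounded differences c_1, ..., c_n.
\begin{align*}
H_\alpha(P\|Q) &= \int \Big(\frac{dP}{dQ}\Big)^{\alpha}\, dQ, \\
P(E) &= \int \mathbf{1}_E \,\frac{dP}{dQ}\, dQ
      \;\le\; H_\alpha(P\|Q)^{1/\alpha}\, Q(E)^{1/\beta}
      && \text{(H\"older's inequality)}, \\
Q\big(|f - \mathbb{E}_Q f| \ge t\big)
      &\le 2\exp\!\Big(-\frac{2t^2}{\sum_{i=1}^n c_i^2}\Big)
      && \text{(McDiarmid under } Q\text{)}, \\
P\big(|f - \mathbb{E}_Q f| \ge t\big)
      &\le 2^{1/\beta}\exp\!\Big(-\frac{2t^2}{\beta\sum_{i=1}^n c_i^2}\Big)\,
         H_\alpha(P\|Q)^{1/\alpha}.
\end{align*}
```

Under independence \(P = Q\), so \(H_\alpha(P\|Q) = 1\) for every \(\alpha\), and letting \(\beta \to 1\) gives back McDiarmid's inequality. Note that in this sketch the deviation is measured around the mean of \(f\) under the product of the marginals \(Q\), not under \(P\).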
Information Theory, Probability
What problem does this paper attempt to address?
The paper asks how to obtain concentration inequalities for non-independent random variables. Classical results such as McDiarmid's inequality are very effective for independent random variables, but they do not apply directly once the variables are dependent. The paper proposes to "pretend" that the random variables are independent and to pay a multiplicative cost that measures how far they are from actual independence. This cost is the Hellinger integral between the joint distribution and the product of the marginal distributions. The authors upper bound this integral using tensorisation properties, which yields a concentration inequality that holds for dependent random variables.

Specifically, the main contributions of the paper are as follows:

1. **A new method**: The authors propose a new approach to concentration for non-independent random variables. In the "large deviations" regime, it achieves the same decay of the probability as in the independent case.
2. **Wide applicability**: The method applies to Markov chains with finite state space, to the Simple Symmetric Random Walk (SSRW), to non-Markovian processes, and to Markov Chain Monte Carlo (MCMC) methods. In each of these settings, the authors give a parameter regime in which their bound improves on existing techniques.
3. **Theoretical basis**: The authors use information-theoretic tools such as the Hellinger integral and the Rényi divergence, and upper bound them through tensorisation properties to derive the new concentration inequalities.
4. **Improved results**: Under certain parameter settings, the method gives tighter bounds than existing ones. For the SSRW in particular, it captures the correct relationship between the distance \(t\) from the average, the number of variables \(n\), and the probability decay in the concentration bound.

In summary, the main objective of the paper is to extend classical concentration inequalities to dependent random variables and to verify the resulting bounds in several applications.
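As a small numerical illustration of the multiplicative cost described above, the following sketch enumerates all paths of a short two-state Markov chain, computes the Hellinger integral between the joint law and the product of its marginals, and checks the Hölder-type inequality P(E) ≤ H_α^{1/α} · Q(E)^{1/β}. The chain, the event, the choice α = 2, and all function names are hypothetical choices made for this example; the code is not the authors' implementation and does not reproduce any bound stated in the paper.

```python
from itertools import product
from math import prod


def joint_markov(init, trans, n):
    """Joint pmf of (X_1, ..., X_n) for a finite-state Markov chain, as a dict."""
    states = range(len(init))
    pmf = {}
    for path in product(states, repeat=n):
        p = init[path[0]]
        for a, b in zip(path, path[1:]):
            p *= trans[a][b]
        pmf[path] = p
    return pmf


def product_of_marginals(joint, n, num_states):
    """Pmf of the product of the one-dimensional marginals of `joint`."""
    marg = [[0.0] * num_states for _ in range(n)]
    for path, p in joint.items():
        for i, s in enumerate(path):
            marg[i][s] += p
    return {path: prod(marg[i][s] for i, s in enumerate(path)) for path in joint}


def hellinger_integral(p, q, alpha):
    """H_alpha(P||Q) = sum_x P(x)^alpha * Q(x)^(1 - alpha), over x with P(x) > 0."""
    return sum(p[x] ** alpha * q[x] ** (1.0 - alpha) for x in p if p[x] > 0)


def holder_bound(p, q, event, alpha):
    """Upper bound on P(E): H_alpha(P||Q)^(1/alpha) * Q(E)^(1/beta)."""
    beta = alpha / (alpha - 1.0)
    q_event = sum(q[x] for x in q if event(x))
    return hellinger_integral(p, q, alpha) ** (1.0 / alpha) * q_event ** (1.0 / beta)


def many_ones(x):
    """Illustrative event: at least 5 of the 6 coordinates equal 1."""
    return sum(x) >= 5


# Hypothetical example: a "sticky" two-state chain started from its stationary law.
init = [0.5, 0.5]
trans = [[0.7, 0.3], [0.3, 0.7]]
n = 6

P = joint_markov(init, trans, n)
Q = product_of_marginals(P, n, num_states=2)

p_event = sum(P[x] for x in P if many_ones(x))
bound = holder_bound(P, Q, many_ones, alpha=2.0)
print("P(E) =", p_event, " bound =", bound)

# Independent case: the Hellinger integral equals 1, so no price is paid.
P_ind = joint_markov(init, [[0.5, 0.5], [0.5, 0.5]], n)
Q_ind = product_of_marginals(P_ind, n, num_states=2)
print("H_2 (independent) =", hellinger_integral(P_ind, Q_ind, 2.0))
```

Exhaustive enumeration costs 2^n operations, so this is only practical for very small n. For longer chains one would have to bound the Hellinger integral analytically, which is the role the summary attributes to tensorisation.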