Classifying the Concentration of the Boolean Cube for Dependent Distributions

Jonathan Root, Mark Kon
2024-08-06
Abstract: A metric probability space $(\Omega,d)$ obeys the ${\it concentration\; of\; measure\; phenomenon}$ if subsets of measure $1/2$ enlarge to subsets of measure close to 1 as a transition parameter $\epsilon$ approaches a limit. In this paper we consider the concentration of the space itself, namely the concentration of the metric $d(x,y)$ for a fixed $y\in \Omega$. For any $y\in \Omega$, the concentration of $d(x,y)$ is guaranteed for product distributions in high dimensions $n$, as $d(x,y)$ is a Lipschitz function in $x$. In fact, in the product setting, the rate at which the metric concentrates is of the same order in $n$ for any fixed $y\in \Omega$. The same thing, however, cannot be said for certain dependent (non-product) distributions. For the Boolean cube $I_n$ (a widely analyzed simple model), we show that, for any dependent distribution, the rate of concentration of the Hamming distance $d_H(x,y)$, for a fixed $y$, depends on the choice of $y\in I_n$, and on the variance of the conditional distributions $\mu(x_k \mid x_1,\dots, x_{k-1})$, $2\leq k\leq n$. We give an inductive bound which holds for all probability distributions on the Boolean cube, and characterize the quality of concentration by a certain positive (negative) correlation condition. Our method of proof is advantageous in that it is both simple and comprehensive. We consider uniform bounding techniques when the variance of the conditional distributions is negligible, and show how this basic technique applies to the concentration of the entire class of Lipschitz functions on the Boolean cube.
Probability
What problem does this paper attempt to address?
This paper addresses how the concentration of measure phenomenon manifests on the Boolean cube under dependent (non-product) distributions, and which factors govern it. Specifically, it studies how the concentration of the Hamming distance \(d_H(x, y)\), for a fixed point \(y\), changes with the choice of \(y\) and with the variance of the conditional distributions.

### Main research questions:

1. **Concentration property of the Hamming distance**:
   - The paper examines the concentration of the Hamming distance \(d_H(x, y)\) for any dependent distribution \(\mu\) on the Boolean cube \(\{0, 1\}^n\).
   - In particular, it studies how tightly \(d_H(x, y)\) concentrates around its mean by bounding \(\int_{\{0, 1\}^n} e^{t(d_H(x, y)-E_{\mu}d_H(x, y))}d\mu(x)\) from above.
2. **Influence of the dependent distribution**:
   - The paper analyzes how the variance of the conditional distributions \(\mu(x_k \mid x_1,\ldots,x_{k-1})\) affects the concentration of the Hamming distance.
   - It handles dependence by induction and gives an upper bound on an error term that reflects the influence of dependence on concentration.
3. **Positive correlation condition**:
   - The paper introduces a positive correlation condition to describe how the choice of \(y\) affects concentration.
   - Specifically, the condition concerns the positive correlation between \(a_{k-1}(x', y', t)\) and \(\mu(x_k = y_k \mid x')\), which determines the sign of the error term.
4. **Simplified upper bound in the case of small variance**:
   - When the variance of the conditional distributions is small, the paper provides a simplified upper bound that is more concise and suits approximately independent distributions.
   - This simplified upper bound has the form \(\int_{\{0, 1\}^n} e^{t(d_H(x, y)-E_{\mu}d_H(x, y))}d\mu(x)\leq e^{t^2/2}\prod_{j = 2}^n(b_j+e^{t^2/2})\), where \(b_j=e^{-t\mu(j)(x_j\neq y_j)}(e^t - 1)\|\epsilon_{x', y_j}\|_{\infty}\).

### Research methods:

- **Inductive method**: Handle the dependent distribution by induction, deriving the upper bound on the error term one coordinate at a time.
- **Positive correlation condition**: Introduce the positive correlation condition to describe how the choice of \(y\) influences concentration.
- **Small variance case**: Provide a simplified upper bound for the case where the variance of the conditional distributions is small.

### Main conclusions:

- The paper proves that on the Boolean cube, for any dependent distribution, the concentration of the Hamming distance depends on the choice of \(y\) and on the variance of the conditional distributions.
- Through the positive correlation condition, the paper shows how the choice of \(y\) affects concentration.
- When the variance of the conditional distributions is small, the paper provides a simplified upper bound that has a concise form and suits approximately independent distributions.

In summary, the paper uses an inductive argument to analyze the concentration of the Hamming distance under dependent distributions on the Boolean cube, and it describes how concentration behaves when the coordinates are dependent.
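
The dependence on \(y\) described above can be checked numerically on a small cube. The following is a minimal sketch, not taken from the paper: it assumes a hypothetical "sticky" Markov chain as the dependent distribution (each bit repeats the previous one with probability `stay`), and the names `markov_chain_pmf`, `centered_mgf`, and `stay` are illustrative. It evaluates the centered moment generating function \(\int e^{t(d_H(x, y)-E_{\mu}d_H(x, y))}d\mu(x)\) by exhaustive summation for two choices of \(y\), and compares the uniform product case against the standard Hoeffding-type bound \(e^{nt^2/8}\) for independent bits.

```python
import itertools
import math


def markov_chain_pmf(n, stay):
    """Return {x: mu(x)} for a hypothetical 'sticky' chain on {0,1}^n.

    x_1 is a fair bit; each later bit repeats the previous one with
    probability `stay`. stay = 0.5 gives the uniform product measure.
    """
    pmf = {}
    for x in itertools.product((0, 1), repeat=n):
        p = 0.5
        for prev, cur in zip(x, x[1:]):
            p *= stay if cur == prev else 1.0 - stay
        pmf[x] = p
    return pmf


def hamming(x, y):
    """Hamming distance d_H(x, y) between two equal-length bit tuples."""
    return sum(a != b for a, b in zip(x, y))


def centered_mgf(pmf, y, t):
    """E exp(t * (d_H(x, y) - E d_H(x, y))) by exhaustive summation."""
    mean = sum(p * hamming(x, y) for x, p in pmf.items())
    return sum(p * math.exp(t * (hamming(x, y) - mean)) for x, p in pmf.items())


n, t = 8, 0.5
zeros = (0,) * n                              # y = 00000000
alternating = tuple(k % 2 for k in range(n))  # y = 01010101

product = markov_chain_pmf(n, stay=0.5)  # independent fair bits
sticky = markov_chain_pmf(n, stay=0.9)   # strongly dependent bits

mgf_product_zeros = centered_mgf(product, zeros, t)
mgf_product_alt = centered_mgf(product, alternating, t)
mgf_sticky_zeros = centered_mgf(sticky, zeros, t)
mgf_sticky_alt = centered_mgf(sticky, alternating, t)
hoeffding = math.exp(n * t * t / 8)  # bound for n independent bits

print("product:", mgf_product_zeros, mgf_product_alt, "bound:", hoeffding)
print("sticky: ", mgf_sticky_zeros, mgf_sticky_alt)
```

In the product case the two values coincide, because \(d_H(x, y)\) is binomial for every \(y\). Under the sticky chain the all-zeros reference point gives a much larger value than the alternating one, since a nearly constant \(x\) is either very close to or very far from \(00\ldots0\) but always about \(n/2\) away from \(0101\ldots\). Exhaustive summation is only feasible for small \(n\); it is meant as an illustration, not as a substitute for the paper's inductive bound.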