On Concentration Inequalities for Sparse Vectors
Kai Zhang
2017-01-01
Abstract: Concentration inequalities play a key role in establishing the suitability of random matrices for compressed sensing and for dimensionality reduction through random projections. In particular, the Restricted Isometry Property (RIP) for random Gaussian matrices [1], [2] and its relationship to the Johnson-Lindenstrauss lemma can be established directly from a concentration inequality [3]. While concentration inequalities are usually established for general vectors, the distortion (in length) of sparse vectors by random matrices is usually obtained by a combinatorial argument together with union bounds. In this work, we study concentration inequalities specific to sparse vectors projected by random matrices whose entries are drawn from compactly supported (e.g., uniform) distributions. This approach naturally yields sharper bounds than generic concentration inequalities. These results suggest the superiority of such distributions over the Gaussian distribution for random projection of sparse vectors. Our experiments show this improvement in the concentration bound for the special case of sparse binary signals, and the results are further corroborated by a higher rate of recovery of general (non-binary) sparse signals from random projections.

Given a matrix $A \in \mathbb{R}^{m \times N}$ whose entries are i.i.d. draws from the normal distribution, $A_{i,j} \sim \mathcal{N}(0, \frac{1}{m})$, any unit-length vector $x \in \mathbb{R}^{N}$ projected to a vector $y \in \mathbb{R}^{m}$ by $y = Ax$ satisfies the inequality

$$\Pr\left(\|y\|_2^2 > 1 + \delta\right) \le e^{-\frac{m}{2}\left(\frac{\delta^2}{2} - \frac{\delta^3}{3}\right)}, \quad (1)$$

for $\delta \in (0, 1)$. This shows that it is possible to represent any $N$-dimensional signal in $m$ dimensions ($m \ll N$) while mostly preserving its energy with high probability. For recovering a class of signals (e.g., sparse signals), we usually want $\delta$ to be as small as possible, so that the random projection is a bijection on that class. This is possible when $m$, the number of measurements, is sufficient with respect to properties of $A$ (e.g., the RIP).
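The Gaussian bound can be checked empirically. The following is a minimal sketch, not the authors' code: it assumes the bound in (1) has the standard Johnson-Lindenstrauss form $e^{-\frac{m}{2}(\delta^2/2 - \delta^3/3)}$ applied to the squared norm $\|y\|_2^2$, and the function names, trial count, and seed are illustrative choices.

```python
import numpy as np

def gaussian_tail_estimate(m, n, delta, trials=2000, seed=0):
    """Monte Carlo estimate of Pr(||Ax||_2^2 > 1 + delta) for Gaussian A.

    A has i.i.d. N(0, 1/m) entries and x is a fixed unit-length vector.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)  # enforce ||x||_2 = 1
    exceed = 0
    for _ in range(trials):
        A = rng.normal(0.0, np.sqrt(1.0 / m), size=(m, n))
        y = A @ x
        exceed += (y @ y) > 1.0 + delta
    return exceed / trials

def jl_upper_bound(m, delta):
    """Assumed form of the bound in (1): exp(-(m/2) * (delta^2/2 - delta^3/3))."""
    return np.exp(-0.5 * m * (delta**2 / 2.0 - delta**3 / 3.0))

if __name__ == "__main__":
    for m in (20, 50, 100):
        est = gaussian_tail_estimate(m, n=100, delta=0.5)
        print(f"m={m:4d}  empirical={est:.4f}  bound={jl_upper_bound(m, 0.5):.4f}")
```

The empirical frequency should sit below the bound for every $m$, and both should shrink as $m$ grows.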
While the majority of work on this topic considers Gaussian and Bernoulli distributions, we show that the uniform distribution, owing to its compact support, offers a tighter bound through a concentration inequality specific to sparse vectors. As described above, if each entry of $A$ is i.i.d. Gaussian with zero mean and variance $\frac{1}{m}$, then $y_i \sim \mathcal{N}(0, \frac{\|x\|_2^2}{m})$ for $1 \le i \le m$, so that $E[\|y\|_2^2] = \|x\|_2^2$. Without loss of generality, we assume $\|x\|_2 = 1$. When the entries of $A$ are instead drawn from a uniform distribution with mean 0 and variance $\frac{1}{m}$ (i.e., $A_{i,j} \sim U(-\sqrt{3/m}, \sqrt{3/m})$), the mean and variance of each $y_i$, and hence $E[\|y\|_2^2]$, remain the same (by linearity and independence). However, the distribution of $y_i$ is no longer Gaussian; it is a compactly supported distribution (a piecewise polynomial). The key point driving our work is that the support of this distribution depends on the $\ell_1$ norm of $x$.

Since $y$ can be viewed as a linear combination of the columns of $A$ weighted by the elements of $x$, each $y_i$, as a random variable, is a linear combination of the random variables $A_{i,1}, A_{i,2}, \ldots, A_{i,N}$ weighted by the elements of $x$. Because the entries of $A$ are drawn independently, the distribution of $y_i$ is the convolution of the distributions of its constituent random variables. With some abuse of notation (for brevity), one can show that

$$p(y_i) = p(x_1 A_{i,1}) * p(x_2 A_{i,2}) * \cdots * p(x_N A_{i,N}), \quad (2)$$

where $*$ is the convolution operator. Since $p(x_j A_{i,j})$ is a (scaled) box function with support $[-|x_j|\sqrt{3/m}, |x_j|\sqrt{3/m}]$, $p(y_i)$ is a univariate (centered) box spline [4] whose vector of directions is precisely $\sqrt{3/m}\,x$. The distribution of $y_i$ is the box spline $M_{\sqrt{3/m}\,x^T}(y)$, whose support is limited to $-\sqrt{3/m}\,\|x\|_1 \le y \le \sqrt{3/m}\,\|x\|_1$. When $x$ is sparse, the support of the distribution is limited by its sparsity, and hence one can obtain tighter bounds for the concentration inequality. When the elements of $x$ are binary (i.e., $0$ and $1/\sqrt{k}$ for a $k$-sparse signal), this box spline simplifies to a uniform B-spline [5].
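The support claim can be illustrated numerically. The sketch below is not the authors' code; it assumes uniform entries on $[-\sqrt{3/m}, \sqrt{3/m}]$ (the range implied by zero mean and variance $1/m$), and the helper names, the dimensions ($N = 100$, $m = 50$, $k = 4$), and the sample count are hypothetical choices. It samples a single entry $y_i$ many times for a binary $k$-sparse $x$ and compares it against the bound $\sqrt{3/m}\,\|x\|_1$.

```python
import numpy as np

def sample_y_entries(x, m, dist, samples, rng):
    """Draw `samples` independent copies of one entry y_i = <row of A, x>."""
    if dist == "uniform":
        a = np.sqrt(3.0 / m)  # U(-a, a) has variance a^2 / 3 = 1 / m
        rows = rng.uniform(-a, a, size=(samples, x.size))
    else:
        rows = rng.normal(0.0, np.sqrt(1.0 / m), size=(samples, x.size))
    return rows @ x

def binary_sparse_signal(n, k, rng):
    """Unit-length k-sparse signal with nonzero entries equal to 1/sqrt(k)."""
    x = np.zeros(n)
    x[rng.choice(n, size=k, replace=False)] = 1.0 / np.sqrt(k)
    return x

rng = np.random.default_rng(0)
n, m, k = 100, 50, 4
x = binary_sparse_signal(n, k, rng)

# Support of the box spline: |y_i| <= sqrt(3/m) * ||x||_1 = sqrt(3k/m) here.
support_bound = np.sqrt(3.0 / m) * np.abs(x).sum()

y_uniform = sample_y_entries(x, m, "uniform", 50_000, rng)
y_gauss = sample_y_entries(x, m, "gaussian", 50_000, rng)

print("support bound:", support_bound)
print("max |y_i| (uniform):", np.abs(y_uniform).max())
print("Gaussian fraction outside bound:", np.mean(np.abs(y_gauss) > support_bound))
print("variances (uniform, Gaussian, 1/m):", y_uniform.var(), y_gauss.var(), 1.0 / m)
```

Both samples have variance close to $1/m$, but only the Gaussian one places mass outside the interval determined by $\|x\|_1$.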
The uniform B-spline distribution is also known in the statistics literature as the Irwin-Hall distribution. While we derive the concentration inequality (using the Chernoff bound) for the entire class of box splines for generic $k$-sparse signals, here we briefly discuss the simple case of uniform B-splines, which applies to binary $k$-sparse signals. Figure 1 shows the tail behavior of the distribution of $\|y\|$ for a $k$-sparse binary signal (with $k = 4$) under the uniform (solid lines) and Gaussian (dashed lines) distributions for different $\delta$ and $m$. This experiment shows that, as $m$ grows, the tail probabilities under both distributions converge to 0 exponentially (as expected); however, the uniform distribution decays faster, since its decay depends on the sparsity of the underlying signal.

A second experiment, shown in Figure 2, compares the recovery rate of general sparse signals from random measurements drawn from the uniform and Gaussian distributions. In this experiment, the signal dimension $N$ is 200. We vary the sparsity $k$ from 1 to 70 and the number of measurements $m$ from 40 to 90, and use the FISTA algorithm for recovery. Each parameter setting is repeated 1000 times. A sensing matrix drawn from the uniform distribution yields a slightly higher recovery rate than one drawn from the Gaussian distribution. This improvement is more noticeable when $m$ is not too large (e.g., $40 \le m \le 70$), which is consistent with the result in Figure 1.

Although the experiment shows a slight improvement in recovery rate over the Gaussian sensing matrix, a closed-form concentration bound for the uniform distribution is difficult to derive. In future work, we will address this problem in another norm space and study the RIP for uniform random matrices. We will also design new algorithms that can benefit from uniformly distributed random matrices.
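A small-scale version of the recovery experiment can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the text names only FISTA, so the LASSO objective $\frac{1}{2}\|Ax - b\|_2^2 + \lambda\|x\|_1$, the regularization weight `lam`, the iteration count, the amplitude distribution of the nonzero entries, and the relative-error tolerance `tol` used to declare success are all hypothetical choices.

```python
import numpy as np

def sensing_matrix(m, n, dist, rng):
    """Zero-mean entries with variance 1/m, either uniform or Gaussian."""
    if dist == "uniform":
        a = np.sqrt(3.0 / m)
        return rng.uniform(-a, a, size=(m, n))
    return rng.normal(0.0, np.sqrt(1.0 / m), size=(m, n))

def fista(A, b, lam=0.01, iters=2000):
    """FISTA for min_x 0.5 * ||A x - b||_2^2 + lam * ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    z = x.copy()
    t = 1.0
    for _ in range(iters):
        g = z - A.T @ (A @ z - b) / L  # gradient step at the extrapolated point
        x_new = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum update
        x, t = x_new, t_new
    return x

def recovery_rate(n, m, k, dist, trials=10, tol=0.1, seed=0):
    """Fraction of trials whose relative recovery error is at most `tol`."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        x = np.zeros(n)
        support = rng.choice(n, size=k, replace=False)
        # Assumed amplitudes: random sign, magnitude uniform in [1, 2].
        x[support] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)
        A = sensing_matrix(m, n, dist, rng)
        x_hat = fista(A, A @ x)
        hits += np.linalg.norm(x_hat - x) <= tol * np.linalg.norm(x)
    return hits / trials

if __name__ == "__main__":
    for dist in ("uniform", "gaussian"):
        print(dist, recovery_rate(n=64, m=40, k=2, dist=dist))
```

Reproducing the comparison described in the text would require the full grid ($N = 200$, $k$ from 1 to 70, $m$ from 40 to 90, 1000 repetitions per setting); the dimensions above are kept small so the sketch runs in seconds.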