Analyzing Tensor Power Method Dynamics in Overcomplete Regime

Anima Anandkumar, Rong Ge, Majid Janzamin
DOI: https://doi.org/10.48550/arXiv.1411.1488
2015-09-15
Abstract: We present a novel analysis of the dynamics of tensor power iterations in the overcomplete regime where the tensor CP rank is larger than the input dimension. Finding the CP decomposition of an overcomplete tensor is NP-hard in general. We consider the case where the tensor components are randomly drawn, and show that the simple power iteration recovers the components with bounded error under mild initialization conditions. We apply our analysis to unsupervised learning of latent variable models, such as multi-view mixture models and spherical Gaussian mixtures. Given the third order moment tensor, we learn the parameters using tensor power iterations. We prove it can correctly learn the model parameters when the number of hidden components $k$ is much larger than the data dimension $d$, up to $k = o(d^{1.5})$. We initialize the power iterations with data samples and prove its success under mild conditions on the signal-to-noise ratio of the samples. Our analysis significantly expands the class of latent variable models where spectral methods are applicable. Our analysis also deals with noise in the input tensor leading to sample complexity result in the application to learning latent variable models.
Machine Learning
What problem does this paper attempt to address?
This paper analyzes the dynamics of tensor power iteration in the overcomplete case. Specifically, it asks how the rank-one components of a tensor can be recovered by simple power iteration when the CP rank of the tensor is greater than the input dimension. In general, finding the CP decomposition of an overcomplete tensor is NP-hard. However, the authors consider the case where the tensor components are randomly sampled and prove that, under mild initialization conditions, simple power iteration recovers these components with bounded error.

### Main Contributions

1. **Dynamic Analysis**: The authors give a detailed analysis of the dynamics of third-order tensor power iteration in the overcomplete case. They assume that the tensor components are randomly sampled from the unit sphere and prove that, under mild initialization conditions, power iteration converges quickly to a local optimum close to a true component.
2. **Applications**: The analysis is applied to unsupervised learning of latent variable models, such as multi-view mixture models and spherical Gaussian mixture models. Given a third-order moment tensor, the model parameters can be learned through tensor power iteration.
3. **Noise Handling**: The authors also analyze the case where the input tensor is noisy and derive sample complexity results.

### Specific Results

- **Theorem 1**: For a rank-\(k\) tensor of the form \(T=\sum_{j\in [k]}\lambda_j a_j\otimes a_j\otimes a_j\), assume that \(k = o(d^{1.5})\) and that the initial vector \(x^{(1)}\) satisfies the correlation condition \(|\langle x^{(1)}, a_j\rangle|\geq d^\beta\frac{\sqrt{k}}{d}\) with some true component \(a_j\), where \(\beta > (\log d)^{-c}\). Then, after \(N = \Theta(\log\log d)\) iterations, the vector output by tensor power iteration has constant-level correlation with the true component \(a_j\) with high probability, that is, \(|\langle x^{(N + 1)}, a_j\rangle|\geq 1-\gamma\).
- **Theorem 2**: Given an exact third-order tensor, tensor power iteration converges to a vector that has constant-level correlation with the true mean vector \(a_j\).
- **Theorem 3**: By jointly iterating over the recovered vectors, the entire factor matrix \(A\) can be consistently recovered.
- **Theorem 4**: Given an empirical tensor, the above guarantees still hold, but the recovery error is affected by noise.

### Related Work

- **Tensor Decomposition for Learning Latent Variable Models**: Anandkumar et al. studied the theoretical and practical aspects of tensor decomposition for learning latent variable models in other works.
- **Learning Gaussian Mixture Models**: In recent years, many studies have improved the methods for learning Gaussian mixture models. These methods fall into two major categories: distance-based methods and spectral methods.

### Conclusion

By analyzing the dynamics of tensor power iteration in the overcomplete case, this paper extends the class of latent variable models to which spectral methods apply and provides consistent recovery guarantees in the presence of noise. This provides new tools and methods for processing high-dimensional data and complex latent variable models.