Arvind K. Saibaba, Agnieszka Międlar
Abstract: This paper expands the analysis of randomized low-rank approximation beyond the Gaussian distribution to four classes of random matrices: (1) independent sub-Gaussian entries, (2) independent sub-Gaussian columns, (3) independent bounded columns, and (4) independent columns with bounded second moment. Using a novel interpretation of the low-rank approximation error involving sample covariance matrices, we provide insight into the requirements of a *good random matrix* for the purpose of randomized low-rank approximation. Although our bounds involve unspecified absolute constants (a consequence of the underlying non-asymptotic theory of random matrices), they allow for qualitative comparisons across distributions. The analysis offers some details on the minimal number of samples (the number of columns $\ell$ of the random matrix $\boldsymbol\Omega$) and the error in the resulting low-rank approximation. We illustrate our analysis in the context of the randomized subspace iteration method as a representative algorithm for low-rank approximation; however, all the results are broadly applicable to other low-rank approximation techniques. We conclude our discussion with numerical examples using both synthetic and real-world test matrices.
What problem does this paper attempt to address?
This paper extends the analysis of randomized low-rank approximation beyond Gaussian random matrices to four classes of random matrices: (1) independent sub-Gaussian entries, (2) independent sub-Gaussian columns, (3) independent bounded columns, and (4) independent columns with bounded second moment. By introducing a novel interpretation of the low-rank approximation error that involves the sample covariance matrix, the authors give insight into what makes a random matrix suitable for this purpose. Although the bounds involve unspecified absolute constants (a consequence of the underlying non-asymptotic random matrix theory), they allow for qualitative comparisons between distributions. The paper also examines the minimum number of samples required for each class (i.e., the number of columns $\ell$ of the random matrix $\boldsymbol\Omega$) and the resulting error of the low-rank approximation. The analysis is presented for the randomized subspace iteration method as a representative algorithm, but the results apply broadly to other low-rank approximation techniques, and they are illustrated with numerical experiments on synthetic and real-world test matrices.
### Main contributions of the paper:
1. **Novel error interpretation**: Provides a novel interpretation of the low-rank approximation error based on the sample covariance matrix, which clarifies what is required of the random matrix.
2. **Sufficient conditions**: Provides sufficient conditions for four classes of random matrices:
   - Independent sub-Gaussian entries (Theorem 3.3)
   - Independent sub-Gaussian columns (Theorem 3.5)
   - Independent bounded columns (Theorem 3.7)
   - Independent columns with bounded second moment (Theorem 3.9)
3. **Wide applicability**: Uses the randomized subspace iteration method as the representative algorithm, but the analysis also applies to other low-rank approximation algorithms, such as the Nyström method and the block Krylov method.
4. **Comparison indicators**: Proposes two quantities for comparing distributions: the minimum number of samples required (i.e., the number of columns $\ell$ of the random matrix $\boldsymbol\Omega$) and the error of the low-rank approximation.
5. **Applicability to large-scale problems**: By using the concept of stable rank, the bounds do not depend explicitly on the size of the matrix, which makes them suitable for large-scale problems.
6. **Theoretical extension**: Provides a theoretical basis for many random matrix distributions, unifies existing results for individual distributions, and broadens the range of distributions available for low-rank computations.
7. **Numerical experiments**: Provides numerical experiments for a variety of test matrices and random matrix distributions, many of which have not previously been explored in the context of low-rank approximation.
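
The following is a minimal sketch of randomized subspace iteration with a pluggable distribution for the random matrix, assuming NumPy. The function names, the number of power iterations `q`, the oversampled column count `ell`, and the synthetic test matrix are all illustrative assumptions and are not taken from the paper. Gaussian and Rademacher (±1) entries are used because both are instances of independent sub-Gaussian entries.

```python
import numpy as np

def gaussian(n, ell, rng):
    # Independent standard normal entries.
    return rng.standard_normal((n, ell))

def rademacher(n, ell, rng):
    # Independent +/-1 entries, a simple sub-Gaussian alternative.
    return rng.choice([-1.0, 1.0], size=(n, ell))

def randomized_subspace_iteration(A, k, ell, q, draw, rng):
    """Rank-k approximation of A from ell random samples and q power steps."""
    Omega = draw(A.shape[1], ell, rng)       # n x ell random matrix
    Q, _ = np.linalg.qr(A @ Omega)           # orthonormal basis for the sketch
    for _ in range(q):
        # Re-orthogonalize after every product to limit round-off.
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    # SVD of the small ell x n matrix Q^T A, truncated to rank k.
    U_b, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_b)[:, :k], s[:k], Vt[:k, :]

# Synthetic test matrix with geometrically decaying singular values
# (an assumed example, not one of the paper's test matrices).
rng = np.random.default_rng(0)
m, n, k, ell, q = 120, 100, 5, 15, 2
U0, _ = np.linalg.qr(rng.standard_normal((m, n)))
V0, _ = np.linalg.qr(rng.standard_normal((n, n)))
sigma = 0.5 ** np.arange(n)
A = (U0 * sigma) @ V0.T

errors = {}
for name, draw in [("gaussian", gaussian), ("rademacher", rademacher)]:
    U, s, Vt = randomized_subspace_iteration(A, k, ell, q, draw, rng)
    errors[name] = np.linalg.norm(A - (U * s) @ Vt, 2)
    print(f"{name:10s} spectral error = {errors[name]:.3e}, "
          f"best possible = {sigma[k]:.3e}")
```

The spectral-norm error of any rank-$k$ approximation is at least the $(k+1)$-th singular value, so printing the two side by side gives a quick sense of how close each distribution gets on this particular matrix. Swapping in a different `draw` function is enough to try another distribution.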
### Main results:
- **Theorem 3.3**: For a random matrix with independent sub-Gaussian entries, the required number of samples $\ell$ is $O(k)$, where $k$ is the target rank.
- **Theorem 3.5**: For a random matrix with independent sub-Gaussian columns, the required number of samples $\ell$ is $O(k)$.
- **Theorem 3.7**: For a random matrix with independent bounded columns, the required number of samples $\ell$ is $O(k \log k)$.
- **Theorem 3.9**: Covers random matrices whose independent columns have a bounded second moment, further extending the class of distributions that can be analyzed.
These results provide theoretical support for randomized low-rank approximation and offer guidance for parameter selection in practical applications.
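
To make the bounded-column case more concrete, the sketch below builds a random matrix whose columns are scaled coordinate vectors $\sqrt{n}\,e_i$ with $i$ drawn uniformly. This is a standard example of an isotropic random vector with bounded norm; whether the paper uses this particular distribution is not stated in the summary, so treat it as an assumption. The constant `c` in $\ell = \lceil c\,k \log k \rceil$ is hypothetical, since the bounds leave the constants unspecified, and `V_k` is a random orthonormal stand-in for the leading right singular vectors of a test matrix. The quantity inspected is the sample covariance of the projected columns, which is one plausible reading of the error interpretation described above and not the paper's exact formulation.

```python
import numpy as np

def scaled_coordinate_columns(n, ell, rng):
    """Columns sqrt(n) * e_i with i uniform on {0, ..., n-1}.

    Every column has norm exactly sqrt(n) (bounded), and the
    expectation of w w^T is the identity (isotropic).
    """
    idx = rng.integers(0, n, size=ell)
    Omega = np.zeros((n, ell))
    Omega[idx, np.arange(ell)] = np.sqrt(n)
    return Omega

rng = np.random.default_rng(1)
n, k = 200, 5
c = 4.0                                   # hypothetical constant, not from the paper
ell = int(np.ceil(c * k * np.log(k)))     # number of samples, O(k log k)

# Stand-in for the k leading right singular vectors (assumed, random).
V_k, _ = np.linalg.qr(rng.standard_normal((n, k)))

Omega = scaled_coordinate_columns(n, ell, rng)
Omega_1 = V_k.T @ Omega                   # k x ell projected samples
sample_cov = (Omega_1 @ Omega_1.T) / ell  # k x k sample covariance
eigs = np.linalg.eigvalsh(sample_cov)

print(f"ell = {ell}")
print(f"eigenvalues of sample covariance: min = {eigs.min():.3f}, max = {eigs.max():.3f}")
```

If the sample covariance is close to the identity, its eigenvalues cluster near 1; a smallest eigenvalue near 0 would indicate that the samples nearly miss part of the leading subspace. Rerunning with larger `c` or a different `V_k` shows how sensitive this is to the number of samples.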