Abstract: Randomized trace estimation is a popular and well-studied technique that approximates the trace of a large-scale matrix $B$ by computing the average of $X^T B X$ for many samples of a random vector $X$. Often, $B$ is symmetric positive definite (SPD), but a number of applications give rise to indefinite $B$. Most notably, this is the case for log-determinant estimation, a task that features prominently in statistical learning, for instance in maximum likelihood estimation for Gaussian process regression. The analysis of randomized trace estimates, including tail bounds, has mostly focused on the SPD case. In this work, we derive new tail bounds for randomized trace estimates applied to indefinite $B$ with Rademacher or Gaussian random vectors. These bounds significantly improve existing results for indefinite $B$, reducing the number of required samples by a factor $n$ or even more, where $n$ is the size of $B$. Even for an SPD matrix, our work improves an existing result by Roosta-Khorasani and Ascher for Rademacher vectors. This work also analyzes the combination of randomized trace estimates with the Lanczos method for approximating the trace of $f(A)$. Particular attention is paid to the matrix logarithm, which is needed for log-determinant estimation. We improve and extend an existing result to cover not only Rademacher but also Gaussian random vectors.
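The estimator described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `hutchinson_trace` and its parameters are hypothetical, and the matrix is assumed to be small enough to store densely.

```python
import numpy as np


def hutchinson_trace(B, num_samples, dist="rademacher", rng=None):
    """Estimate tr(B) as the average of x^T B x over random vectors x.

    `dist` selects Rademacher (entries +1/-1 with equal probability) or
    standard Gaussian sample vectors. Names here are illustrative only.
    """
    rng = np.random.default_rng(rng)
    n = B.shape[0]
    if dist == "rademacher":
        X = rng.choice([-1.0, 1.0], size=(n, num_samples))
    elif dist == "gaussian":
        X = rng.standard_normal((n, num_samples))
    else:
        raise ValueError("dist must be 'rademacher' or 'gaussian'")
    # Column i of B @ X is B x_i; the column-wise dot product with X
    # gives the quadratic form x_i^T B x_i for every sample at once.
    quad_forms = np.einsum("ij,ij->j", X, B @ X)
    return quad_forms.mean()


# Usage: a symmetric indefinite test matrix (eigenvalues of both signs).
demo_rng = np.random.default_rng(0)
G = demo_rng.standard_normal((200, 200))
B_demo = (G + G.T) / 2
print("exact trace:       ", np.trace(B_demo))
print("Rademacher estimate:", hutchinson_trace(B_demo, 500, "rademacher", rng=1))
print("Gaussian estimate:  ", hutchinson_trace(B_demo, 500, "gaussian", rng=1))
```

For a diagonal matrix the Rademacher estimate is exact, since every sample satisfies $x_j^2 = 1$; this matches the Rademacher bounds depending only on the off-diagonal part of $B$.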

What problem does this paper attempt to address?

The problem this paper addresses is: **how to obtain sharper tail bounds for randomized trace estimates of indefinite matrices and apply them to log-determinant estimation**. Specifically, the paper focuses on the following points:

1. **Improving existing tail bounds**: Most existing analyses of randomized trace estimates, including tail bounds, focus on symmetric positive definite (SPD) matrices. The paper derives new tail bounds for indefinite $B$ with Rademacher or Gaussian random vectors. Compared with existing results for indefinite matrices, these bounds reduce the number of required samples by a factor $n$ or more, where $n$ is the size of $B$.
2. **Handling indefinite matrices**: Results derived for SPD matrices do not carry over directly to indefinite $B$. For example, in log-determinant estimation, even if $A$ is SPD, $B = \log(A)$ may be indefinite: its eigenvalues are the logarithms of the eigenvalues of $A$, so they are negative wherever an eigenvalue of $A$ is below 1. The new bounds cover this case.
3. **Combining with the Lanczos method**: The paper analyzes the combination of randomized trace estimates with the Lanczos method, which approximates the quadratic forms $x^T f(A)x$, for estimating the trace of $f(A)$. Particular attention is paid to the matrix logarithm, which is needed for log-determinant estimation, and an existing result is extended to cover Gaussian as well as Rademacher random vectors.

### Formula summary

- **Trace estimation formula**: \[ \text{tr}_N(B) := \frac{1}{N} \sum_{i = 1}^N (X^{(i)})^T B X^{(i)} \] where $X^{(i)}$ are independent random vectors.
- **Relationship between determinant and trace**: \[ \log(\det(A))=\text{tr}(\log(A)) \]
- **Tail bounds**: Here $\text{tr}_G^N$ and $\text{tr}_R^N$ denote the estimate $\text{tr}_N$ computed with Gaussian and Rademacher vectors, respectively, and $D_B$ is the diagonal part of $B$.

  For Gaussian random vectors: \[ P\left( \left| \text{tr}_G^N(B)-\text{tr}(B) \right| \geq \varepsilon \right) \leq 2 \exp\left( -\frac{N\varepsilon^2}{4 \|B\|_F^2 + 4\varepsilon \|B\|_2} \right) \]

  For Rademacher random vectors: \[ P\left( \left| \text{tr}_R^N(B)-\text{tr}(B) \right| \geq \varepsilon \right) \leq 2 \exp\left( -\frac{N\varepsilon^2}{8 \|B - D_B\|_F^2 + 8\varepsilon \|B - D_B\|_2} \right) \]

### Application background

These problems matter in fields such as statistical learning, maximum likelihood estimation, Gaussian process regression, and lattice quantum chromodynamics. Where the determinant or trace of a large matrix must be estimated efficiently, sharper tail bounds mean that fewer samples are needed to reach a given accuracy and failure probability, which reduces the computational cost. The paper thereby extends the analysis of randomized trace estimation from SPD matrices to indefinite ones.
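To make the tail bounds above concrete, each can be inverted for the sample count: requiring the right-hand side to be at most a failure probability $\delta$ gives $N \geq 4\varepsilon^{-2}(\|B\|_F^2 + \varepsilon\|B\|_2)\log(2/\delta)$ for Gaussian vectors, and the analogous expression with constant 8 and $B - D_B$ for Rademacher vectors. The sketch below evaluates these expressions; the helper name `required_samples` is hypothetical, $D_B$ is assumed to be the diagonal part of $B$, and the norms are computed exactly from a dense matrix, which would only be estimated in a large-scale setting.

```python
import math

import numpy as np


def required_samples(B, eps, delta, dist="gaussian"):
    """Smallest N for which the stated tail bound is at most delta.

    Solves 2 * exp(-N * eps^2 / (c * ||M||_F^2 + c * eps * ||M||_2)) <= delta
    for N, with (c, M) = (4, B) for Gaussian vectors and
    (c, M) = (8, B - D_B) for Rademacher vectors.
    Assumption: D_B is the diagonal part of B.
    """
    B = np.asarray(B, dtype=float)
    if dist == "gaussian":
        M, c = B, 4.0
    elif dist == "rademacher":
        M, c = B - np.diag(np.diag(B)), 8.0
    else:
        raise ValueError("dist must be 'gaussian' or 'rademacher'")
    fro_sq = np.linalg.norm(M, "fro") ** 2  # squared Frobenius norm
    spec = np.linalg.norm(M, 2)             # spectral norm (largest singular value)
    bound = c * (fro_sq + eps * spec) * math.log(2.0 / delta) / eps**2
    return math.ceil(bound)


# Usage: a small symmetric indefinite matrix.
B_small = np.array([[2.0, 1.0], [1.0, -3.0]])
print("Gaussian:  ", required_samples(B_small, eps=1.0, delta=0.1))
print("Rademacher:", required_samples(B_small, eps=1.0, delta=0.1, dist="rademacher"))
```

Note that the Rademacher count depends only on the off-diagonal part of $B$, so it drops to zero for a diagonal matrix, whereas the Gaussian count depends on the full matrix.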

On randomized trace estimates for indefinite matrices with an application to determinants

Randomized Nyström approximation of non-negative self-adjoint operators

Norm and Trace Estimation with Random Rank-one Vectors

XTrace: Making the most of every sample in stochastic trace estimation

Faster randomized partial trace estimation

Extremal bounds for Gaussian trace estimation

Randomized estimation of spectral densities of large matrices made accurate

Computation of the von Neumann entropy of large matrices via trace estimators and rational Krylov methods

Krylov-aware stochastic trace estimation

A Convergence Analysis on the Iterative Trace Ratio Algorithm and Its Refinements

A distance theorem for inhomogenous random rectangular matrices

A general error analysis for randomized low-rank approximation with application to data assimilation

Optimized Tail Bounds for Random Matrix Series

Randomized low-rank approximations beyond Gaussian random matrices

Analysis of stochastic probing methods for estimating the trace of functions of sparse symmetric matrices

Improved variants of the Hutch++ algorithm for trace estimation

Improved bounds for randomized Schatten norm estimation of numerically low-rank matrices

Traces of powers of random matrices over local fields

On The Variance of Schatten $p$-Norm Estimation with Gaussian Sketching Matrices

On tail estimates for Randomized Incremental Construction