Entrywise tensor-train approximation of large tensors via random embeddings

Stanislav Budzinskiy
2024-08-21
Abstract: The theory of low-rank tensor-train approximation is well-understood when the approximation error is measured in the Frobenius norm. The entrywise maximum norm is equally important but is significantly weaker for large tensors, making the estimates obtained via the Frobenius norm and norm equivalence pessimistic or even meaningless. In this article, we derive a direct estimate of the entrywise approximation error that is applicable in some of these cases. The estimate is given in terms of the higher-order generalization of the matrix factorization norm, and its proof is based on the tensor-structured Hanson-Wright inequality. The theoretical results are accompanied with numerical experiments carried out with the method of alternating projections.
Numerical Analysis, Probability
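The abstract's remark that estimates obtained via norm equivalence can be pessimistic or meaningless can be illustrated with a small numerical sketch. For a tensor with \(N\) entries, \(\|A\|_{\text{max}}\leq\|A\|_F\leq\sqrt{N}\,\|A\|_{\text{max}}\), so an error that is small relative to \(\|A\|_F\) may still be large relative to \(\|A\|_{\text{max}}\). The example below is not taken from the paper: the all-ones tensor, the single-spike error, and the helper names `frobenius_norm` and `max_norm` are assumptions chosen only to make the gap visible.

```python
import numpy as np


def frobenius_norm(t):
    """Square root of the sum of squared entries."""
    return float(np.sqrt(np.sum(np.abs(t) ** 2)))


def max_norm(t):
    """Largest absolute value of an entry."""
    return float(np.max(np.abs(t)))


# Illustrative choice: an all-ones tensor with N = 10**4 entries.
shape = (10, 10, 10, 10)
a = np.ones(shape)

# Hypothetical error tensor: all of the error sits in a single entry,
# scaled so that its Frobenius norm equals eps * ||a||_F.
eps = 0.05
e = np.zeros(shape)
e[0, 0, 0, 0] = eps * frobenius_norm(a)

# Relative error in the Frobenius norm (equals eps by construction).
rel_fro = frobenius_norm(e) / frobenius_norm(a)

# Relative error in the maximum norm (equals eps * sqrt(N) here).
rel_max = max_norm(e) / max_norm(a)

print(f"relative Frobenius error: {rel_fro:.3f}")
print(f"relative max-norm error:  {rel_max:.3f}")
```

In this construction the relative Frobenius error is 5%, while the relative entrywise error exceeds 100%, which is the kind of gap that motivates a direct entrywise estimate.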
What problem does this paper attempt to address?
This paper addresses error estimation for tensor-train (TT) approximation of large tensors in the entrywise maximum norm. Existing theory of low-rank TT approximation focuses mainly on the Frobenius norm. For large tensors the maximum norm is significantly weaker than the Frobenius norm, so entrywise estimates obtained from Frobenius-norm bounds via norm equivalence are pessimistic or even meaningless. The author therefore derives a direct entrywise error estimate that applies in some of these cases. The estimate is stated in terms of a higher-order generalization of the matrix factorization norm, and its proof relies on the tensor-structured Hanson-Wright inequality. Numerical experiments carried out with the method of alternating projections accompany the theoretical results.

### Formulas and Notation Explanation

- **Frobenius norm**: \(\|A\|_F=\sqrt{\sum_{i_1,\ldots,i_d}|A(i_1,\ldots,i_d)|^2}\)
- **Entrywise maximum norm**: \(\|A\|_{\text{max}}=\max_{i_1,\ldots,i_d}|A(i_1,\ldots,i_d)|\)
- **Matrix factorization norm**: \(\gamma_2(A)=\inf\{\|X\|_{2,\infty}\|Y\|_{2,\infty}:X,Y\text{ such that }A = XY^{\top}\}\)
- **Tensor-train decomposition**: \(A(i_1,\ldots,i_d)=\sum_{\alpha_1 = 1}^{r_1}\cdots\sum_{\alpha_{d - 1}=1}^{r_{d - 1}}G_1(1,i_1,\alpha_1)G_2(\alpha_1,i_2,\alpha_2)\cdots G_d(\alpha_{d - 1},i_d,1)\)

### Main Contributions

- **Theorem 1.6**: For any \(\epsilon\in(0,1)\) and \(n_1,\ldots,n_d\in\mathbb{N}\), consider
  \[
  r=\left\lceil c_d\frac{\log\left(2e\prod_{s = 1}^d n_s\right)}{\epsilon^2}\right\rceil\in\mathbb{N},
  \]
  where \(c_d>0\) is a constant depending only on \(d\), and \(e\) is the base of the natural logarithm. For each \(A\in\mathbb{R}^{n_1\times\cdots\times n_d}\), there exists \(B\in\mathbb{R}^{n_1\times\cdots\times n_d}\) such that \(\text{rank}_{\text{TT}}(B)\preceq r\) and
  \[
  \|A - B\|_{\text{max}}\leq\epsilon\cdot\gamma_{\text{TT}}^F(A).
  \]

### Technical Details

- **Tensor-structured Hanson-Wright inequality**: Used to estimate the moments of quadratic forms in sub-Gaussian random variables.
- **Partial trace**: Defined as \(C\langle\Omega'\rangle\in\mathbb{R}^{m\times2}((\Omega\setminus\Omega')\oplus(\Omega\setminus\Omega'))\), where \[ C\langle\Omega'\rangle_{i\dot{+}j}=\sum_{\ell\in I_m(\Omega')}