Von Neumann's inequality for tensors

Stéphane Chrétien, Tianwen Wei
DOI: https://doi.org/10.48550/arXiv.1502.01616
2015-02-06
Abstract: For two matrices in $\mathbb R^{n_1\times n_2}$, the von Neumann inequality says that their scalar product is less than or equal to the scalar product of their singular spectrum. In this short note, we extend this result to real tensors and provide a complete study of the equality case.
Numerical Analysis, Spectral Theory
What problem does this paper attempt to address?
This paper generalizes the classical von Neumann inequality from matrices to real tensors and characterizes exactly when equality holds.

### Problem Background

For two matrices \( X, Y \in \mathbb{R}^{n_1\times n_2} \), the classical von Neumann inequality states that:

\[ \langle X, Y \rangle \leq \langle \sigma(X), \sigma(Y) \rangle \]

where \( \sigma(X) \) and \( \sigma(Y) \) are the vectors of singular values of \( X \) and \( Y \), respectively. Equality holds if and only if \( X \) and \( Y \) have the same singular subspaces.

### Generalization to Tensors

For tensors, the difficulty in generalizing the von Neumann inequality lies in choosing an appropriate definition of singular values and of the singular value decomposition (SVD). The authors use the higher-order singular value decomposition (HOSVD), which is based on the Tucker decomposition, to define the singular values of a tensor.

### Main Results

The main result of the paper is Theorem 3.1, which states the tensor von Neumann inequality and the conditions under which equality holds.

#### Theorem 3.1

Let \( X, Y \in \mathbb{R}^{n_1\times\cdots\times n_D} \) be two tensors. Then for all \( d = 1,\ldots, D \), we have:

\[ \langle X, Y \rangle \leq \langle \sigma^{(d)}(X), \sigma^{(d)}(Y) \rangle \]

Here \( \sigma^{(d)}(X) \) denotes the vector of mode-\( d \) singular values of \( X \) given by its HOSVD.

Equality holds if and only if there exist orthogonal matrices \( W^{(d)} \in \mathbb{R}^{n_d\times n_d} \) (\( d = 1,\ldots, D \)) and tensors \( D(X), D(Y) \in \mathbb{R}^{n_1\times\cdots\times n_D} \) such that:

\[ X = D(X)\times_1 W^{(1)}\cdots\times_D W^{(D)} \]

\[ Y = D(Y)\times_1 W^{(1)}\cdots\times_D W^{(D)} \]

where \( D(X) \) and \( D(Y) \) satisfy the following properties:

- \( D(X) \) and \( D(Y) \) are block-diagonal tensors with the same number and size of blocks.
- Let \( L \) be the number of blocks, and let \( \{D_l(X)\}_{l = 1,\ldots,L} \) and \( \{D_l(Y)\}_{l = 1,\ldots,L} \) be the diagonal blocks of \( D(X) \) and \( D(Y) \), respectively. Then for each \( l = 1,\ldots, L \), the two blocks \( D_l(X) \) and \( D_l(Y) \) are proportional.

### Application Prospects

This result is expected to help describe the subdifferentials of certain tensor functions, as the matrix inequality does in the matrix case. Such functions arise naturally in computational statistics, machine learning, and numerical analysis, especially where sparsity-promoting norms are used as convex surrogates for rank penalties. The generalization thereby provides a foundation for further theory and applications in tensor analysis.
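The inequality in Theorem 3.1 can be checked numerically on random tensors. The sketch below is only an illustration, not code from the paper. It assumes that \( \sigma^{(d)}(X) \) is the vector of singular values of the mode-\( d \) unfolding of \( X \) (the usual HOSVD convention), and the helper names `unfold`, `mode_singular_values`, and `von_neumann_gap` are hypothetical. The column ordering of the unfolding is also an assumption, but it affects neither the singular values nor the inner product.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode unfolding: rows follow the chosen mode, columns all other modes.

    The column ordering is an assumed convention; singular values do not depend on it.
    """
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_singular_values(tensor, mode):
    """Singular values of the mode unfolding, in non-increasing order."""
    return np.linalg.svd(unfold(tensor, mode), compute_uv=False)


def von_neumann_gap(x, y, mode):
    """<sigma(x), sigma(y)> minus <x, y> for one mode; non-negative if the inequality holds."""
    inner = float(np.sum(x * y))
    spectral = float(np.dot(mode_singular_values(x, mode),
                            mode_singular_values(y, mode)))
    return spectral - inner


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4, 5))
y = rng.standard_normal((3, 4, 5))

# One gap per mode d = 1, ..., D (0-based axes in NumPy).
gaps = [von_neumann_gap(x, y, d) for d in range(x.ndim)]
for d, gap in enumerate(gaps):
    print(f"mode {d + 1}: gap = {gap:.6f}")

# Simplest equality case: y is a positive multiple of x (a single proportional block).
equality_gaps = [von_neumann_gap(x, 2.0 * x, d) for d in range(x.ndim)]
```

For generic random tensors each gap should be strictly positive, while the proportional pair `x` and `2.0 * x` should give gaps that vanish up to floating-point error, consistent with the equality condition stated in the theorem.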