Abstract: For two matrices in $\mathbb R^{n_1\times n_2}$, the von Neumann inequality says that their scalar product is less than or equal to the scalar product of their singular spectra. In this short note, we extend this result to real tensors and provide a complete study of the equality case.
What problem does this paper attempt to address?
The paper generalizes the classical von Neumann inequality from matrices to real tensors and characterizes the conditions under which equality holds.
### Problem Background
For two matrices \( X \) and \( Y \in \mathbb{R}^{n_1\times n_2} \), the classical von Neumann inequality states that:
\[
\langle X, Y \rangle \leq \langle \sigma(X), \sigma(Y) \rangle
\]
where \( \sigma(X) \) and \( \sigma(Y) \) denote the vectors of singular values of \( X \) and \( Y \), respectively, sorted in nonincreasing order. Equality holds if and only if \( X \) and \( Y \) admit a simultaneous singular value decomposition, that is, they share the same left and right singular vectors.
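
The matrix inequality is easy to check numerically. The following is a minimal NumPy sketch, not taken from the paper: the helper name `von_neumann_gap`, the matrix sizes, and the chosen singular values are illustrative assumptions. It evaluates the gap \( \langle \sigma(X), \sigma(Y) \rangle - \langle X, Y \rangle \) for a random pair, and for a pair built to share singular vectors, where the gap should vanish up to rounding.

```python
import numpy as np

def von_neumann_gap(X, Y):
    """Return <sigma(X), sigma(Y)> - <X, Y> (hypothetical helper name)."""
    # np.linalg.svd returns singular values in descending order.
    sx = np.linalg.svd(X, compute_uv=False)
    sy = np.linalg.svd(Y, compute_uv=False)
    return float(sx @ sy - np.sum(X * Y))

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))
Y = rng.standard_normal((4, 6))
gap_random = von_neumann_gap(X, Y)  # nonnegative by the inequality

# Equality case: reuse the singular vectors of X with new singular values
# (the values 4, 3, 2, 1 are an arbitrary illustrative choice).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Y_shared = U @ np.diag([4.0, 3.0, 2.0, 1.0]) @ Vt
gap_shared = von_neumann_gap(X, Y_shared)  # zero up to rounding error
```

Because `Y_shared` is assembled from the same `U` and `Vt` as `X`, with its singular values already in descending order, the two matrices have a simultaneous singular value decomposition.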
### Generalization to Tensors
For tensors, the difficulty in generalizing the von Neumann inequality lies in how to define singular values and a singular value decomposition (SVD) appropriately. The paper uses the higher-order singular value decomposition (HOSVD), which is based on the Tucker decomposition, to define the singular values of a tensor: for each mode \( d \), the vector \( \sigma^{(d)}(X) \) collects the singular values of the mode-\( d \) unfolding of \( X \).
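
To make the HOSVD notion of singular values concrete, here is a small NumPy sketch under stated assumptions: the function names `unfold` and `mode_singular_values` are hypothetical, modes are indexed from 0 in the code (the text counts from 1), and the column ordering of the unfolding is one of several conventions. The ordering does not affect the singular values.

```python
import numpy as np

def unfold(T, d):
    """Mode-d unfolding: bring axis d to the front and flatten the rest."""
    return np.moveaxis(T, d, 0).reshape(T.shape[d], -1)

def mode_singular_values(T, d):
    """Singular values of the mode-d unfolding, in descending order."""
    return np.linalg.svd(unfold(T, d), compute_uv=False)

rng = np.random.default_rng(0)
T = rng.standard_normal((3, 4, 5))

# One spectrum per mode; in general they differ from mode to mode.
spectra = [mode_singular_values(T, d) for d in range(T.ndim)]
```

Each spectrum has as many entries as the smaller dimension of the corresponding unfolding, and the squared entries of every spectrum sum to the squared Frobenius norm of `T`, since unfolding only rearranges the entries.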
### Main Results
The main result of the paper is Theorem 3.1, which gives the specific form of the tensor von Neumann inequality and the conditions for the equality to hold:
#### Theorem 3.1
Let \( X, Y \in \mathbb{R}^{n_1\times\cdots\times n_D} \) be two tensors. Then for all \( d = 1,\ldots, D \), we have:
\[
\langle X, Y \rangle \leq \langle \sigma^{(d)}(X), \sigma^{(d)}(Y) \rangle
\]
The equality holds if and only if there exist orthogonal matrices \( W^{(d)} \in \mathbb{R}^{n_d\times n_d} \) (\( d = 1,\ldots, D \)) and tensors \( D(X), D(Y) \in \mathbb{R}^{n_1\times\cdots\times n_D} \) such that:
\[
X = D(X)\times_1 W^{(1)}\cdots\times_D W^{(D)}
\]
\[
Y = D(Y)\times_1 W^{(1)}\cdots\times_D W^{(D)}
\]
where \( D(X) \) and \( D(Y) \) satisfy the following properties:
- \( D(X) \) and \( D(Y) \) are block-diagonal tensors with the same number and size of blocks.
- Let \( L \) be the number of blocks, and \( \{D_l(X)\}_{l = 1,\ldots,L} \) and \( \{D_l(Y)\}_{l = 1,\ldots,L} \) be the blocks on the diagonals of \( D(X) \) and \( D(Y) \), respectively. Then for each \( l = 1,\ldots, L \), the two blocks \( D_l(X) \) and \( D_l(Y) \) are proportional.
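
The inequality can be read as the matrix von Neumann inequality applied to the mode-\( d \) unfoldings, since unfolding preserves the scalar product. The sketch below, which is an illustration rather than the paper's own code, checks this numerically with NumPy. The names `unfold` and `mode_gap` are hypothetical, modes are indexed from 0, and only the simplest instance of the equality case is tested: a single block with \( Y = cX \) and \( c \geq 0 \). It does not exercise the general block structure of the theorem.

```python
import numpy as np

def unfold(T, d):
    """Mode-d unfolding: bring axis d to the front and flatten the rest."""
    return np.moveaxis(T, d, 0).reshape(T.shape[d], -1)

def mode_gap(X, Y, d):
    """Return <sigma^(d)(X), sigma^(d)(Y)> - <X, Y> for mode d."""
    sx = np.linalg.svd(unfold(X, d), compute_uv=False)
    sy = np.linalg.svd(unfold(Y, d), compute_uv=False)
    return float(sx @ sy - np.sum(X * Y))

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 4, 5))
Y = rng.standard_normal((3, 4, 5))

# Generic pair: the gap is nonnegative for every mode.
gaps_random = [mode_gap(X, Y, d) for d in range(X.ndim)]

# Simplest equality instance (assumed for illustration): Y = c * X, c >= 0.
gaps_proportional = [mode_gap(X, 2.5 * X, d) for d in range(X.ndim)]
```

For the proportional pair, both sides equal \( c \) times the squared Frobenius norm of \( X \), so every entry of `gaps_proportional` is zero up to rounding error.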
### Application Prospects
This result is expected to help in describing the subdifferentials of certain tensor functions, in the same way the matrix inequality does in the matrix case. Such functions occur naturally in computational statistics, machine learning, and numerical analysis, especially where sparsity-promoting norms are used as convex surrogates for rank penalties.
By extending the inequality and its equality case to tensors, the paper supplies a basic tool for both the theory and the applications of tensor analysis.