Efficiency of First-Order Methods for Low-Rank Tensor Recovery with the Tensor Nuclear Norm Under Strict Complementarity

Dan Garber, Atara Kaplan
2023-08-03
Abstract: We consider convex relaxations for recovering low-rank tensors based on constrained minimization over a ball induced by the tensor nuclear norm, recently introduced in \cite{tensor_tSVD}. We build on a recent line of results that considered convex relaxations for the recovery of low-rank matrices and established that under a strict complementarity condition (SC), both the convergence rate and per-iteration runtime of standard gradient methods may improve dramatically. We develop the appropriate strict complementarity condition for the tensor nuclear norm ball and obtain the following main results under this condition: 1. When the objective to minimize is of the form $f(\mX)=g(\mA\mX)+\langle{\mC,\mX}\rangle$, where $g$ is strongly convex and $\mA$ is a linear map (e.g., least squares), a quadratic growth bound holds, which implies linear convergence rates for standard projected gradient methods, despite the fact that $f$ need not be strongly convex. 2. For a smooth objective function, when initialized in certain proximity of an optimal solution which satisfies SC, standard projected gradient methods only require SVD computations (for projecting onto the tensor nuclear norm ball) of rank that matches the tubal rank of the optimal solution. In particular, when the tubal rank is constant, this implies nearly linear (in the size of the tensor) runtime per iteration, as opposed to super linear without further assumptions. 3. For a nonsmooth objective function which admits a popular smooth saddle-point formulation, we derive similar results to the latter for the well known extragradient method. An additional contribution, which may be of independent interest, is the rigorous extension of many basic results regarding tensors of arbitrary order, which were previously obtained only for third-order tensors.
Optimization and Control, Machine Learning
What problem does this paper attempt to address?
This paper addresses the efficiency and convergence of first-order methods for low-rank tensor recovery under a strict complementarity condition. Specifically, the authors study the performance of convex relaxations based on the tensor nuclear norm (TNN) when recovering low-rank higher-order tensors.

### Problem Background

Low-rank models are important in statistics and machine learning for multi-dimensional arrays. From a computational perspective, low rank means a more compact representation, which allows efficient storage and runtime and is crucial in high-dimensional settings. From a statistical perspective, low rank usually means that a multi-dimensional array can be recovered from noisy or partial information under appropriate assumptions. In recent years, research on low-rank matrices has been extensive, but research on higher-order low-rank tensors is relatively scarce.

### Research Motivation

Many existing methods rely on the Tucker rank or the CP rank of tensors, but these notions of rank are computationally expensive to work with. For example, determining the CP rank of a tensor is NP-hard. Researchers therefore introduced the t-product and its corresponding t-SVD, and proposed the tensor nuclear norm (TNN) as a convex relaxation for low-rank tensor recovery.

### Main Problems

1. **Strict Complementarity (SC)**: How to define the SC condition for tensor optimization problems and establish its validity.
2. **Convergence and Efficiency of First-Order Methods**: Whether, when SC holds, standard gradient methods attain a linear convergence rate, and whether the computational cost of each iteration can be significantly reduced.
3. **Simplification of Projection Computation**: Whether, in a region close to the optimal solution, a low-rank SVD can replace a full-rank SVD in the projection computation, greatly reducing the computation time.

### Solutions

The authors address the above problems as follows:

1. **Deriving the SC Condition**: For a differentiable objective function \( f(X) \), an SC condition applicable to tensor optimization problems is derived and shown to be valid in general cases.
2. **Linear Convergence Rate**: For an objective function of the form \( f(X)=g(AX)+\langle C, X\rangle \), where \( g(\cdot) \) is strongly convex and \( A \) is a linear map, it is proved that under SC the problem satisfies a quadratic growth bound, which ensures a linear convergence rate for standard gradient methods.
3. **Low-Rank SVD Projection**: In a region close to the optimal solution, it is proved that the tubal rank of the projected gradient mapping does not exceed the tubal rank of the optimal solution, so that a low-rank SVD can be used for the projection, greatly reducing the time complexity of each iteration.

### Formula Display

- Tensor Nuclear Norm (TNN): \[ \|X\|_*=\sum_{i = 1}^{n_1}\sigma_i(X) \] where \( \sigma_i(X) \) represents the \( i \)-th largest singular value of tensor \( X \).
- Tubal Rank of the Projected Gradient Mapping: \[ \text{rank}_{\text{tubal}}(\text{proj}(X))\leq r^* \] where \( r^* \) is the tubal rank of the optimal solution.
- Quadratic Growth Bound: \[ f(X)-f(X^*)\geq\frac{\gamma}{2}\|X - X^*\|^2 \] where \( \gamma>0 \) is the quadratic growth parameter.
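
To see why a quadratic growth bound of this kind can yield a linear rate, here is a sketch of a standard textbook argument; it is not taken from the paper and may differ from the authors' proof. It assumes that \( f \) is convex and \( \beta \)-smooth, that projected gradient uses step size \( 1/\beta \) over the feasible TNN ball (written \( \mathcal{K} \) here), and that \( X^*_t \) denotes the optimal solution closest to the iterate \( X_t \). The symbols \( \beta \), \( \eta \), \( \mathcal{K} \) and \( h_t = f(X_t)-f(X^*) \) are introduced only for this illustration.

Because \( X_{t+1} \) minimizes the quadratic upper model of \( f \) over \( \mathcal{K} \),
\[ f(X_{t+1}) \leq \min_{Y\in\mathcal{K}} \left\{ f(X_t)+\langle \nabla f(X_t), Y-X_t\rangle+\frac{\beta}{2}\|Y-X_t\|^2 \right\}. \]
Choosing \( Y=(1-\eta)X_t+\eta X^*_t \) with \( \eta\in[0,1] \), and using convexity in the form \( \langle \nabla f(X_t), X^*_t-X_t\rangle \leq -h_t \), gives
\[ h_{t+1} \leq (1-\eta)h_t+\frac{\beta\eta^2}{2}\|X_t-X^*_t\|^2 \leq \left(1-\eta+\frac{\beta\eta^2}{\gamma}\right)h_t, \]
where the second inequality applies the quadratic growth bound \( \|X_t-X^*_t\|^2\leq \frac{2}{\gamma}h_t \). Taking \( \eta=\gamma/(2\beta) \), which lies in \( [0,1] \) when \( \gamma\leq 2\beta \), yields
\[ h_{t+1} \leq \left(1-\frac{\gamma}{4\beta}\right)h_t, \]
that is, the optimality gap contracts by a constant factor per iteration even though \( f \) itself need not be strongly convex.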
### Summary

This paper proves the efficiency and convergence of first-order methods in the low-rank tensor recovery problem by introducing the strict complementarity condition, and proposes a method to simplify projection computation, making the solution of large-scale tensor optimization problems more feasible.
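
To make the projection step concrete, the following is a minimal NumPy sketch for real third-order tensors. It is not the authors' code. It assumes the common t-SVD convention in which the TNN equals \( 1/n_3 \) times the sum of the matrix nuclear norms of the Fourier-domain frontal slices, and the tubal rank equals the largest rank among those slices; normalizations differ between papers. The function names (`fourier_slices`, `tnn`, `tubal_rank`, `project_tnn_ball`) are hypothetical, and the sketch uses full SVDs for simplicity, whereas the saving discussed above comes from replacing them with truncated SVDs once the projected point is known to have low tubal rank.

```python
import numpy as np


def fourier_slices(X):
    """FFT along the third mode; returns an array of shape (n3, n1, n2)."""
    return np.moveaxis(np.fft.fft(X, axis=2), 2, 0)


def tnn(X):
    """TNN under the assumed convention: (1/n3) * sum of slice nuclear norms."""
    S = np.linalg.svd(fourier_slices(X), compute_uv=False)
    return S.sum() / X.shape[2]


def tubal_rank(X, tol=1e-8):
    """Largest numerical rank among the Fourier-domain frontal slices."""
    S = np.linalg.svd(fourier_slices(X), compute_uv=False)
    return int((S > tol).sum(axis=1).max())


def project_nonneg_l1_ball(v, radius):
    """Euclidean projection of a nonnegative vector onto {w >= 0, sum(w) <= radius}."""
    if v.sum() <= radius:
        return v.copy()
    u = np.sort(v)[::-1]                      # sort in decreasing order
    css = np.cumsum(u)
    k = np.arange(1, len(u) + 1)
    rho = np.nonzero(u - (css - radius) / k > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)   # common soft-threshold level
    return np.maximum(v - theta, 0.0)


def project_tnn_ball(X, tau):
    """Project X onto {Y : TNN(Y) <= tau}, tau > 0, by thresholding all
    Fourier-domain singular values with one shared threshold."""
    n3 = X.shape[2]
    U, S, Vh = np.linalg.svd(fourier_slices(X), full_matrices=False)
    # The unnormalized FFT scales the nuclear-norm budget by n3.
    S_new = project_nonneg_l1_ball(S.ravel(), n3 * tau).reshape(S.shape)
    Yf = np.einsum("kir,kr,krj->kij", U, S_new, Vh)
    # The shared threshold keeps conjugate symmetry, so the result is real.
    return np.fft.ifft(np.moveaxis(Yf, 0, 2), axis=2).real


if __name__ == "__main__":
    # Toy tensor: only the first frontal slice is nonzero.
    X = np.zeros((3, 3, 4))
    X[:, :, 0] = np.diag([3.0, 2.0, 0.0])
    for tau in (2.0, 0.5):
        Y = project_tnn_ball(X, tau)
        print(tau, tnn(Y), tubal_rank(Y))
```

In this toy example, shrinking the radius `tau` lowers the tubal rank of the projected tensor, since more singular values fall below the shared threshold. This is the effect that allows a truncated SVD of matching rank to replace the full SVD in each slice.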