Multi-Dimensional Image Recovery Via Fully-Connected Tensor Network Decomposition under the Learnable Transforms

Lyu Cheng-Yao, Zhao Xi-Le, Li Ben-Zheng, Zhang Hao, Huang Ting-Zhu
DOI: https://doi.org/10.1007/s10915-022-02009-0
2022-01-01
Journal of Scientific Computing
Abstract: Multi-dimensional image recovery from incomplete data is a fundamental problem in data processing. Because it captures the correlations between any two modes of the multi-dimensional image, i.e., the target tensor, the fully-connected tensor network (FCTN) decomposition has recently shown promising performance on multi-dimensional image recovery. However, FCTN decomposition suffers from computational deficiency, especially for large-scale multi-dimensional images. To address this deficiency, we propose a learnable transform-based FCTN model (termed T-FCTN), which retains the advantage of FCTN decomposition at a low computational cost. More concretely, we learn semi-orthogonal transforms along each mode of the target tensor to project the large-scale tensor $$\mathcal{X} \in \mathbb{R}^{I\times I\times \cdots \times I}$$ into a small-scale essential tensor $$\mathcal{E} \in \mathbb{R}^{r\times r\times \cdots \times r}$$, and then apply FCTN decomposition to the small-scale essential tensor. To solve the proposed model, we develop an efficient proximal alternating minimization (PAM)-based algorithm with a theoretical convergence guarantee. Moreover, the computational complexity of PAM for T-FCTN is $$\mathcal{O}\left(N\sum_{k=2}^{N} r^{k} R^{k(N-k)+k-1} + N r^{N-1} R^{2(N-1)} + N R^{3(N-1)} + N\sum_{k=1}^{N} r^{k} I^{N-k+1}\right)$$ at each iteration, which is significantly lower than the $$\mathcal{O}\left(N\sum_{k=2}^{N} I^{k} R^{k(N-k)+k-1} + N I^{N-1} R^{2(N-1)} + N R^{3(N-1)}\right)$$ of PAM for FCTN when $$r\ll I$$. Extensive numerical experiments on color videos and light field images illustrate the superiority of the proposed method over other state-of-the-art methods in terms of quality metrics, visual quality, and running time.
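The projection step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the transforms here are random semi-orthogonal matrices obtained from a QR factorization rather than learned ones, the helper names (`semi_orthogonal`, `mode_product`, `project`, `reconstruct`) and the sizes `I = 20`, `r = 4`, `N = 3` are hypothetical, and the FCTN decomposition of the essential tensor and the PAM updates are omitted.

```python
import numpy as np

def semi_orthogonal(I, r, rng):
    """Return an I x r matrix U with orthonormal columns (U.T @ U = identity)."""
    Q, _ = np.linalg.qr(rng.standard_normal((I, r)))  # reduced QR gives shape (I, r)
    return Q

def mode_product(T, M, mode):
    """Multiply tensor T along axis `mode` by matrix M of shape (J, T.shape[mode])."""
    out = np.tensordot(M, T, axes=(1, mode))  # contracted axis is replaced; new axis is first
    return np.moveaxis(out, 0, mode)          # move the new axis back to position `mode`

def project(X, Us):
    """Project X (I x ... x I) to an essential tensor (r x ... x r) using U_k^T on each mode."""
    E = X
    for k, U in enumerate(Us):
        E = mode_product(E, U.T, k)
    return E

def reconstruct(E, Us):
    """Map an essential tensor back to the large scale using U_k on each mode."""
    X = E
    for k, U in enumerate(Us):
        X = mode_product(X, U, k)
    return X

# Assumed toy sizes: a 3rd-order tensor with I = 20 per mode and r = 4.
rng = np.random.default_rng(0)
I, r, N = 20, 4, 3
Us = [semi_orthogonal(I, r, rng) for _ in range(N)]

# Build a tensor that lies exactly in the range of the transforms,
# so projecting and reconstructing loses nothing.
core = rng.standard_normal((r,) * N)
X = reconstruct(core, Us)

E = project(X, Us)          # small-scale essential tensor, shape (4, 4, 4)
X_hat = reconstruct(E, Us)  # back to shape (20, 20, 20)
```

In this toy setting `E` holds `r**N = 64` entries instead of the `I**N = 8000` entries of `X`, which is why a subsequent decomposition on `E` would be cheaper when `r` is much smaller than `I`. For a general tensor that does not lie in the range of the transforms, `X_hat` would only approximate `X`.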