SGD_Tucker: A Novel Stochastic Optimization Strategy for Scalable Parallel Sparse Tucker Decomposition
Hao Li,Zixuan Li,Kenli Li,Jan S. Rellermeyer,Lydia Chen,Keqin Li,Lydia Y. Chen
DOI: https://doi.org/10.1109/tpds.2020.3047460
IF: 5.3
2021-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract: Sparse Tucker Decomposition (STD) algorithms learn a core tensor and a group of factor matrices to obtain an optimal low-rank representation of a High-Order, High-Dimension, and Sparse Tensor (HOHDST). However, existing STD algorithms suffer from an explosion of intermediate variables, because forming those variables (the Khatri-Rao product, the Kronecker product, and matrix-matrix multiplication) involves all elements of the sparse tensor. This problem prevents the deep fusion of efficient computation and big data platforms. To overcome this bottleneck, a novel stochastic optimization strategy (SGD_Tucker) is proposed for STD, which automatically divides the high-dimension intermediate variables into small batches of intermediate matrices. Specifically, SGD_Tucker operates only on randomly selected small samples rather than on all elements, while maintaining the overall accuracy and convergence rate. In practice, SGD_Tucker offers two distinct advances over the state of the art. First, SGD_Tucker can prune the communication overhead for the core tensor in distributed settings. Second, the low data dependence of SGD_Tucker enables fine-grained parallelization, which lets SGD_Tucker achieve lower computational overhead at the same accuracy. Experimental results show that SGD_Tucker runs at least 2X faster than the state of the art.
computer science, theory & methods,engineering, electrical & electronic
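
Illustrative sketch (not from the paper): the abstract's central idea, updating a Tucker model from small random batches of observed entries instead of sweeping the whole sparse tensor, can be shown with a plain element-wise SGD on a 3rd-order tensor. This is a generic stochastic gradient step under assumed settings, not the authors' exact SGD_Tucker update rule, and it omits the distributed and fine-grained parallel parts. All names (`predict`, `sgd_step`, `run_demo`), the tensor sizes, ranks, learning rate `lr`, and regularization weight `lam` are hypothetical choices made for this example.

```python
import math
import random


def predict(G, A, B, C, i, j, k):
    """Tucker model value at (i, j, k): sum over p,q,r of G[p][q][r]*A[i][p]*B[j][q]*C[k][r]."""
    return sum(G[p][q][r] * A[i][p] * B[j][q] * C[k][r]
               for p in range(len(G))
               for q in range(len(G[0]))
               for r in range(len(G[0][0])))


def rmse(G, A, B, C, entries):
    """Root-mean-square error over a list of ((i, j, k), value) entries."""
    sq = sum((predict(G, A, B, C, i, j, k) - x) ** 2 for (i, j, k), x in entries)
    return math.sqrt(sq / len(entries))


def sgd_step(G, A, B, C, batch, lr=0.05, lam=1e-3):
    """One stochastic step: only the rows touched by the sampled entries are updated."""
    for (i, j, k), x in batch:
        err = predict(G, A, B, C, i, j, k) - x
        a, b, c = A[i][:], B[j][:], C[k][:]  # snapshots, so every gradient uses old values
        R1, R2, R3 = len(a), len(b), len(c)
        for p in range(R1):
            g = sum(G[p][q][r] * b[q] * c[r] for q in range(R2) for r in range(R3))
            A[i][p] -= lr * (err * g + lam * a[p])
        for q in range(R2):
            g = sum(G[p][q][r] * a[p] * c[r] for p in range(R1) for r in range(R3))
            B[j][q] -= lr * (err * g + lam * b[q])
        for r in range(R3):
            g = sum(G[p][q][r] * a[p] * b[q] for p in range(R1) for q in range(R2))
            C[k][r] -= lr * (err * g + lam * c[r])
        for p in range(R1):  # the core tensor is updated last, from the same snapshots
            for q in range(R2):
                for r in range(R3):
                    G[p][q][r] -= lr * (err * a[p] * b[q] * c[r] + lam * G[p][q][r])


def run_demo(seed=0, dim=6, rank=2, density=0.3, steps=1000, batch_size=8):
    """Fit a synthetic sparse tensor; returns (RMSE before, RMSE after) on observed entries."""
    rng = random.Random(seed)

    def matrix(rows, cols, lo, hi):
        return [[rng.uniform(lo, hi) for _ in range(cols)] for _ in range(rows)]

    def core(lo, hi):
        return [matrix(rank, rank, lo, hi) for _ in range(rank)]

    # Synthetic ground truth (assumed data, for illustration only).
    Gt, At, Bt, Ct = core(0, 1), matrix(dim, rank, 0, 1), matrix(dim, rank, 0, 1), matrix(dim, rank, 0, 1)
    observed = [((i, j, k), predict(Gt, At, Bt, Ct, i, j, k))
                for i in range(dim) for j in range(dim) for k in range(dim)
                if rng.random() < density]

    # Small random initialization of the model being learned.
    G, A, B, C = core(0.1, 0.5), matrix(dim, rank, 0.1, 0.5), matrix(dim, rank, 0.1, 0.5), matrix(dim, rank, 0.1, 0.5)
    before = rmse(G, A, B, C, observed)
    for _ in range(steps):
        batch = rng.sample(observed, min(batch_size, len(observed)))  # random small sample
        sgd_step(G, A, B, C, batch)
    return before, rmse(G, A, B, C, observed)


if __name__ == "__main__":
    before, after = run_demo()
    print(f"RMSE before: {before:.4f}, after: {after:.4f}")
```

Each step touches only the factor rows indexed by the sampled entries plus the small core tensor, so no Khatri-Rao or Kronecker product over the full tensor is ever formed; that is the property the abstract attributes to sampling-based updates.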