Alternating Mixing Stochastic Gradient Descent for Large-scale Matrix Factorization

Zhenhong Chen, Yanyan Lan, Jiafeng Guo, Jun Xu, Xueqi Cheng
2014-01-01
Abstract: This paper is concerned with distributed stochastic gradient descent (SGD) for large-scale matrix factorization (MF), which seeks to approximate a data matrix V by the product of two low-rank matrices W and H. Among many distributed methods, iterative parameter mixing (IPM) has been proven to be one of the most resource-efficient and effective techniques. However, some recent empirical studies showed that IPM fails in MF. The main reason lies in the coupling of W and H, which makes the direct mixing strategy no longer correct in MF. To address the problem, we propose an alternating mixing stochastic gradient descent algorithm for the MF problem, namely AM-SGD. In the new algorithm, matrices W and H are updated alternately, with the parameter mixing strategy applied in both procedures. In this way, the correctness of mixing is guaranteed and the effectiveness of IPM is preserved. Theoretical analysis and experimental results demonstrate that AM-SGD is more efficient and effective than state-of-the-art distributed SGD algorithms for MF.
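The abstract gives no pseudocode, so the following is only a minimal single-process sketch of the alternating idea it describes, not the authors' implementation. It assumes a squared-error loss, uniform averaging as the mixing step, a round-robin split of the observed entries across simulated workers, and a fixed step size; the names `sgd_pass`, `mix`, `am_sgd_sketch` and `rmse` and all parameter defaults are hypothetical. The point it illustrates is that each phase holds one factor fixed, so the copies being averaged all belong to the same factor.

```python
import random

def sgd_pass(entries, W, H, lr, update_w):
    """One SGD pass over a shard of (i, j, v) entries.
    Updates only W when update_w is True, otherwise only H."""
    rank = len(H)
    for i, j, v in entries:
        err = v - sum(W[i][k] * H[k][j] for k in range(rank))
        for k in range(rank):
            if update_w:
                W[i][k] += lr * err * H[k][j]   # H is held fixed
            else:
                H[k][j] += lr * err * W[i][k]   # W is held fixed

def mix(copies):
    """Mixing step (assumed here to be a uniform average of equally shaped matrices)."""
    rows, cols = len(copies[0]), len(copies[0][0])
    return [[sum(c[r][q] for c in copies) / len(copies) for q in range(cols)]
            for r in range(rows)]

def am_sgd_sketch(entries, m, n, rank, workers=4, rounds=50, lr=0.05, seed=0):
    """Alternate between a W phase and an H phase, mixing after each phase."""
    rng = random.Random(seed)
    W = [[rng.uniform(0, 0.5) for _ in range(rank)] for _ in range(m)]
    H = [[rng.uniform(0, 0.5) for _ in range(n)] for _ in range(rank)]
    # Assumption: observed entries are split round-robin across simulated workers.
    shards = [entries[s::workers] for s in range(workers)]
    for _ in range(rounds):
        # W phase: every worker updates its own copy of W against the shared, fixed H.
        local = []
        for shard in shards:
            W_copy = [row[:] for row in W]
            sgd_pass(shard, W_copy, H, lr, update_w=True)
            local.append(W_copy)
        W = mix(local)
        # H phase: every worker updates its own copy of H against the mixed, fixed W.
        local = []
        for shard in shards:
            H_copy = [row[:] for row in H]
            sgd_pass(shard, W, H_copy, lr, update_w=False)
            local.append(H_copy)
        H = mix(local)
    return W, H

def rmse(entries, W, H):
    """Root-mean-square error of W @ H on the observed entries."""
    rank = len(H)
    sq = sum((v - sum(W[i][k] * H[k][j] for k in range(rank))) ** 2
             for i, j, v in entries)
    return (sq / len(entries)) ** 0.5
```

Calling `am_sgd_sketch(entries, m, n, rank)` on a list of `(row, column, value)` triples returns the two factors, and `rmse` can be used to compare them against the observed entries. In a real distributed setting the inner loops over `shards` would run in parallel on separate machines, with only the mixed factor exchanged between phases.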