An Efficient Parallelization Approach for Large-Scale Sparse Non-Negative Matrix Factorization Using Kullback-Leibler Divergence on Multi-GPU.
Hao Li,Kenli Li,Jiwu Peng,Junyan Hu,Keqin Li
DOI: https://doi.org/10.1109/ispa/iucc.2017.00085
2017-01-01
Abstract: Matrix factorization (MF), one of the most accurate and scalable dimension reduction techniques, has become popular in collaborative filtering (CF) recommender systems, social networks, and graph communities. Currently, Kullback-Leibler Non-negative Matrix Factorization (KL-NMF) is one of the best-known approaches to MF, because non-negativity is a characteristic feature of the CF model. However, it is non-trivial to obtain high-performance KL-NMF on Graphics Processing Units (GPUs) for large-scale problems, because KL-NMF produces redundant large-scale intermediate data, manipulates matrices frequently, and accesses sparse, irregularly placed entries. In this work, we propose single-thread-based KL-NMF, which depends only on multiplication and summation of the involved feature tuples, and we then present L2-norm regularized single-thread-based KL-NMF. On that basis, a novel CUDA parallelization of KL-NMF (CuKL-NMF) is presented for GPU computing. Furthermore, to process large-scale CF data sets and take advantage of GPU computation power, we propose multi-GPU CuKL-NMF (MCuKL-NMF). Compared with state-of-the-art parallel algorithms, e.g., CUMF and CCD++, MCuKL-NMF obtains the highest performance.
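To make the idea of a KL-NMF update that only touches observed entries concrete, the following is a minimal serial sketch in plain Python. It is not the CuKL-NMF algorithm itself: it assumes the standard multiplicative update rule for the generalized KL divergence restricted to observed ratings, and it assumes one common way of adding an L2 penalty (the term `lam * A[i][k]` in the denominator). The function names, the `(row, column, rating)` entry format, and all parameters are hypothetical choices made for illustration.

```python
import math
import random


def kl_divergence(entries, U, V):
    """Generalized KL divergence summed over the observed entries only."""
    total = 0.0
    for i, j, r in entries:
        pred = sum(u * v for u, v in zip(U[i], V[j]))
        total += r * math.log(r / pred) - r + pred
    return total


def update_factor(observed, A, B, lam):
    """Multiplicative KL update of A with B held fixed.

    observed maps a row index of A to its list of (row index of B, rating).
    Only feature vectors attached to observed entries are read, so no dense
    intermediate matrix is formed.
    Assumption: L2 regularization enters as lam * A[i][k] in the denominator.
    """
    rank = len(A[0])
    for i, obs in observed.items():
        numer = [0.0] * rank
        denom = [0.0] * rank
        for j, r in obs:
            pred = sum(a * b for a, b in zip(A[i], B[j]))
            ratio = r / pred
            for k in range(rank):
                numer[k] += B[j][k] * ratio
                denom[k] += B[j][k]
        for k in range(rank):
            A[i][k] *= numer[k] / (denom[k] + lam * A[i][k])


def kl_nmf(entries, n_rows, n_cols, rank, iters, lam=0.0, seed=0):
    """Factorize sparse positive ratings into non-negative factors U and V."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_cols)]
    by_row, by_col = {}, {}
    for i, j, r in entries:
        by_row.setdefault(i, []).append((j, r))
        by_col.setdefault(j, []).append((i, r))
    for _ in range(iters):
        update_factor(by_row, U, V, lam)  # update row (user) factors
        update_factor(by_col, V, U, lam)  # update column (item) factors
    return U, V
```

In this sketch each row of a factor matrix is updated independently of the others once the other factor is fixed, which is the kind of independence a GPU parallelization could exploit by assigning rows to threads; how CuKL-NMF and MCuKL-NMF actually partition the work is not specified in the abstract.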