A Random Algorithm for Low-Rank Decomposition of Large-Scale Matrices with Missing Entries
Yiguang Liu, Yinjie Lei, Chunguang Li, Wenzheng Xu, Yifei Pu
DOI: https://doi.org/10.1109/tip.2015.2458176
IF: 10.6
2015-01-01
IEEE Transactions on Image Processing
Abstract: A random submatrix method (RSM) is proposed to calculate the low-rank decomposition $\hat{U}_{m \times r} \hat{V}_{n \times r}^{T}$ ($r < m, n$) of a matrix $Y \in \mathbb{R}^{m \times n}$ (generally assuming $m > n$) with known-entry percentage $0 < p \le 1$. RSM is very fast, as only $O(m r^{2} p^{r})$ or $O(n^{3} p^{3r})$ floating-point operations (flops) are required, which compares favorably with the $O(mnr + r^{2}(m + n))$ flops required by the state-of-the-art algorithms. RSM also has a small memory requirement, as only $\max(n^{2}, mr + nr)$ real values need to be stored. Assuming that the known entries are uniformly distributed in $Y$, submatrices formed by known entries are randomly selected from $Y$ with statistical size $k \times n p^{k}$ or $m p^{l} \times l$, where $k$ or $l$ is usually taken to be $r + 1$. We propose and prove a theorem: under random noise, the subspace associated with a smaller singular value has a smaller probability of turning into the space associated with any one of the $r$ largest singular values. Based on the theorem, the $n p^{k} - k$ null vectors or the $l - r$ right singular vectors associated with the minor singular values are calculated for each submatrix. These vectors ought to be the null vectors of the submatrix formed by the chosen $n p^{k}$ or $l$ columns of the ground truth of $\hat{V}^{T}$. If enough submatrices are randomly chosen, $\hat{V}$ and $\hat{U}$ can be estimated accordingly. Experimental results on random synthetic matrices with sizes such as $131072 \times 1024$ and on real data sets such as dinosaur indicate that RSM is 4.30 to 197.95 times faster than the state-of-the-art algorithms. Meanwhile, its precision is considerably high, achieving or approximating the best.
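The row-sampling variant described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `rsm_sketch`, the trial count, the accumulation of constraint vectors into an n x n matrix, and the final per-row least-squares step for the left factor are all assumptions made for illustration, and the demo uses noise-free synthetic data. Each trial picks k = r + 1 rows, keeps the columns known in all of them, and takes the right singular vectors beyond the first k as null vectors; these constrain the corresponding rows of the right factor.

```python
import numpy as np


def rsm_sketch(Y, mask, r, n_trials=300, seed=0):
    """Illustrative row-sampling sketch (hypothetical helper, not the paper's code)."""
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    k = r + 1                 # rows per submatrix, as the abstract suggests
    A = np.zeros((n, n))      # accumulates w w^T for every constraint vector w

    for _ in range(n_trials):
        rows = rng.choice(m, size=k, replace=False)
        # Columns whose entries are known in all k chosen rows.
        cols = np.flatnonzero(mask[rows].all(axis=0))
        if cols.size <= k:
            continue          # too few columns to yield any null vector
        S = Y[np.ix_(rows, cols)]
        # Right singular vectors beyond the first k act as null vectors of S.
        _, _, Vt = np.linalg.svd(S, full_matrices=True)
        Z = Vt[k:]            # shape (cols.size - k, cols.size)
        A[np.ix_(cols, cols)] += Z.T @ Z

    # The right factor spans the directions least penalised by the constraints:
    # the r eigenvectors of A with the smallest eigenvalues (eigh sorts ascending).
    _, eigvecs = np.linalg.eigh(A)
    V_hat = eigvecs[:, :r]

    # Assumed final step: fit each row of the left factor by least squares
    # on that row's known entries only.
    U_hat = np.zeros((m, r))
    for i in range(m):
        known = mask[i]
        U_hat[i] = np.linalg.lstsq(V_hat[known], Y[i, known], rcond=None)[0]
    return U_hat, V_hat


# Small noise-free demo on synthetic data (sizes chosen arbitrarily).
rng = np.random.default_rng(1)
m, n, r, p = 200, 30, 2, 0.8
Y_true = rng.standard_normal((m, r)) @ rng.standard_normal((n, r)).T
mask = rng.random((m, n)) < p                 # True where an entry is known
Y_obs = np.where(mask, Y_true, 0.0)           # unknown entries are never read
U_hat, V_hat = rsm_sketch(Y_obs, mask, r)
rel_err = np.linalg.norm(U_hat @ V_hat.T - Y_true) / np.linalg.norm(Y_true)
print(f"relative reconstruction error: {rel_err:.2e}")
```

The accumulator `A` holds n^2 values, which is consistent with the memory bound quoted in the abstract, though whether the paper accumulates constraints this way is an assumption of the sketch.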