Double Robust Principal Component Analysis.

Qianqian Wang, QuanXue Gao, Gan Sun, Chris Ding
DOI: https://doi.org/10.1016/j.neucom.2020.01.097
IF: 6
2020-01-01
Neurocomputing
Abstract: Robust Principal Component Analysis (RPCA), which aims to recover underlying clean data with low-rank structure from corrupted data, is a powerful tool in machine learning and data mining. However, in many real-world applications, new data (i.e., out-of-sample data) in the testing phase are unseen during training. In this setting, (1) RPCA, being a transductive method, is inherently incapable of handling out-of-sample data, and (2) naively applying RPCA to these applications does not explicitly consider the relationship between reconstruction error and low-rank representation. To tackle these problems, we propose Double Robust Principal Component Analysis (DRPCA) to deal with the out-of-sample problem. More specifically, we integrate a reconstruction error into the criterion function of RPCA. The proposed model then benefits from (1) the robustness of principal components to outliers and missing values, (2) the bridge between reconstruction error and low-rank representation, and (3) low-rank clean data extraction from a new datum by a linear transform. Extensive experiments on several datasets demonstrate its superiority over state-of-the-art models in several clustering and low-rank recovery tasks.
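For orientation, the criterion that RPCA is commonly stated with (the convex "principal component pursuit" form) can be written as below. This is the standard textbook formulation, not the DRPCA objective, which the abstract does not spell out; the symbols X, L, S and λ are notation assumed here for illustration.

```latex
% Standard RPCA (principal component pursuit); notation assumed for illustration.
%   X \in \mathbb{R}^{d \times n} : observed (corrupted) data matrix
%   L : low-rank clean component,  S : sparse corruption component
%   \|\cdot\|_*  : nuclear norm (sum of singular values)
%   \|\cdot\|_1  : entrywise l1 norm,  \lambda > 0 : trade-off weight
\[
  \min_{L,\,S} \; \|L\|_{*} + \lambda \,\|S\|_{1}
  \quad \text{s.t.} \quad X = L + S
\]
```

Because L and S are optimized directly as matrices of the same size as the training data X, this form yields no mapping that could be applied to a datum outside X, which is the sense in which RPCA is transductive.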