Robust Unsupervised Feature Selection Via Dual Self-Representation and Manifold Regularization

Chang Tang, Xinwang Liu, Miaomiao Li, Pichao Wang, Jiajia Chen, Lizhe Wang, Wanqing Li
DOI: https://doi.org/10.1016/j.knosys.2018.01.009
IF: 8.139
2018-01-01
Knowledge-Based Systems
Abstract: Unsupervised feature selection has become an important and challenging pre-processing step in machine learning and data mining, since large amounts of unlabelled high-dimensional data often need to be processed. In this paper, we propose an efficient method for robust unsupervised feature selection via dual self-representation and manifold regularization, referred to as DSRMR. On the one hand, a feature self-representation term is used to learn the feature representation coefficient matrix, which measures the importance of the different feature dimensions. On the other hand, a sample self-representation term is used to automatically learn the sample similarity graph, which preserves the local geometrical structure of the data; this structure has been shown to be critical in unsupervised feature selection. By using the ℓ2,1-norm to regularize both the feature representation residual matrix and the representation coefficient matrix, our method is robust to outliers, and the row sparsity that the ℓ2,1-norm induces in the feature coefficient matrix can effectively select representative features. During the optimization process, the feature coefficient matrix and the sample similarity graph constrain each other to reach an optimal solution. Experimental results on ten real-world data sets demonstrate that the proposed method can effectively identify important features, outperforming many state-of-the-art unsupervised feature selection methods in terms of clustering accuracy (ACC) and normalized mutual information (NMI).
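The role of the ℓ2,1-norm in the abstract can be illustrated with a small sketch. The ℓ2,1-norm of a matrix is the sum of the Euclidean norms of its rows, so penalizing it pushes whole rows toward zero, and the features whose rows keep a large norm are the ones treated as important. The code below is not the DSRMR optimization itself; the function names `l21_norm` and `rank_features` and the toy coefficient matrix `W` are assumptions made for illustration only.

```python
import numpy as np


def l21_norm(M):
    """Return the l2,1-norm of M: the sum of the Euclidean norms of its rows."""
    return float(np.sum(np.linalg.norm(M, axis=1)))


def rank_features(W, k):
    """Return the indices of the k rows of W with the largest Euclidean norm.

    Each row of W is assumed to hold the coefficients of one feature, so a
    larger row norm is read as a more important feature.
    """
    scores = np.linalg.norm(W, axis=1)
    # Sort by descending score; a stable sort keeps ties in index order.
    return np.argsort(-scores, kind="stable")[:k]


# Hypothetical coefficient matrix with one row per feature (illustration only).
W = np.array([
    [3.0, 4.0],   # row norm 5
    [0.0, 0.0],   # row norm 0: a feature the row sparsity has discarded
    [1.0, 0.0],   # row norm 1
    [0.0, 2.0],   # row norm 2
])

total = l21_norm(W)            # sum of the row norms
top2 = rank_features(W, 2)     # indices of the two highest-scoring features
```

In this toy matrix the all-zero second row contributes nothing to the norm and would be ranked last, which is the row-sparsity effect the abstract relies on for selecting representative features.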