Exploiting Local Coherent Patterns for Unsupervised Feature Ranking

Qinghua Huang,Dacheng Tao,Xuelong Li,Lianwen Jin,Gang Wei
DOI: https://doi.org/10.1109/tsmcb.2011.2151256
2011-01-01
Abstract:Prior to pattern recognition, feature selection is often used to identify relevant features and discard irrelevant ones for obtaining improved analysis results. In this paper, we aim to develop an unsupervised feature ranking algorithm that evaluates features using discovered local coherent patterns, which are known as biclusters. The biclusters (viewed as submatrices) are discovered from a data matrix. These submatrices are used for scoring relevant features from two aspects, i.e., the interdependence of features and the separability of instances. The features are thereby ranked with respect to their accumulated scores from the total discovered biclusters before the pattern classification. Experimental results show that this proposed method can yield comparable or even better performance in comparison with the well-known Fisher score, Laplacian score, and variance score using three UCI data sets, well improve the results of gene expression data analysis using gene ontology annotation, and finally demonstrate its advantage of unsupervised feature ranking for high-dimensional data.
What problem does this paper attempt to address?