A Filter Feature Selection Method Based LLRFC and Redundancy Analysis for Tumor Classification Using Gene Expression Data

Jiangeng Li, Xiaodan Li, Wei Zhang
DOI: https://doi.org/10.1109/wcica.2016.7578590
2016-01-01
Abstract: Tumor gene expression data are characterized by high dimensionality and small sample size, which poses a serious challenge for tumor classification. Because not all genes are associated with tumor phenotypes, irrelevant features seriously degrade learning performance, so it is necessary to select relevant features from the original data. In this paper, we propose a new filter feature selection method based on the graph embedding framework for manifold learning, named LLRFC score. The method takes into account the relationship between sample classes and features. However, the features it selects may still contain some redundancy, so we improve it by eliminating redundancy among the selected features; the improved method is named LLRFC score+. We compare our method with several other feature selection approaches on nine public tumor gene expression datasets. The experimental results demonstrate that the proposed method is promising and effective for tumor classification.
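The two-stage idea in the abstract (rank features with a filter score, then discard redundant ones) can be illustrated with a minimal sketch. The abstract does not define the LLRFC score, so the code below swaps in a simple Fisher-style score as a stand-in and uses an absolute Pearson correlation threshold as the redundancy test. The function names `fisher_score` and `select_features` and the parameter `max_abs_corr` are hypothetical and chosen only for illustration; this is not the authors' method.

```python
import numpy as np


def fisher_score(X, y):
    """Stand-in relevance score (NOT the LLRFC score).

    Ratio of between-class to within-class variance, computed per feature.
    X has shape (n_samples, n_features); y holds the class labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += Xc.shape[0] * (Xc.mean(axis=0) - overall_mean) ** 2
        within += Xc.shape[0] * Xc.var(axis=0)
    # Small constant avoids division by zero for features with no within-class spread.
    return between / (within + 1e-12)


def select_features(X, y, k, max_abs_corr=0.9):
    """Greedy filter selection with an assumed correlation-based redundancy check.

    Features are visited from highest to lowest score; a feature is kept only
    if its absolute Pearson correlation with every already-kept feature is
    below max_abs_corr. Returns at most k column indices.
    """
    X = np.asarray(X, dtype=float)
    scores = fisher_score(X, y)
    order = np.argsort(scores)[::-1]
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = []
    for j in order:
        if all(corr[j, s] < max_abs_corr for s in selected):
            selected.append(int(j))
        if len(selected) == k:
            break
    return selected


# Toy usage: column 1 is a linear copy of column 0, so only one of the two is kept.
X_demo = np.array([
    [0, 1, 0, 1],
    [1, 3, 0, 0],
    [0, 1, 2, 0],
    [1, 3, 2, 1],
    [4, 9, 2, 1],
    [5, 11, 2, 0],
    [4, 9, 4, 0],
    [5, 11, 4, 1],
], dtype=float)
y_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(select_features(X_demo, y_demo, k=2))
```

In a gene expression setting, the columns of `X` would be genes and the rows samples; the threshold and the score are the two pieces one would replace to approximate a specific published method.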