Identifying Characteristic Genes And Clustering Via An L-P-Norm Robust Feature Selection Method For Integrated Data

Shasha Wu, Mi-Xiao Hou, Jin-Xing Liu, Juan Wang, Shasha Yuan
DOI: https://doi.org/10.1007/978-3-319-95933-7_51
2018-01-01
Abstract: In bioinformatics, feature selection is a widely used method for dimensionality reduction. However, the traditional feature selection method Joint Embedding Learning and Sparse Regression (JELSR) uses a squared error term, which makes the algorithm extremely sensitive to noise and outliers and degrades its performance. To address this problem, we propose a new robust feature selection model, named RJELSR, that imposes an L-p-norm constraint on the error term to improve the robustness of the algorithm. We also give an effective optimization strategy based on the augmented Lagrange multiplier method to obtain the optimal results. In the experimental section, we first preprocess different cancer data to obtain the integrated data, and then apply our algorithm to it for feature selection and sample clustering. Experiments on integrated data demonstrate that our method outperforms the other compared methods and that the selected characteristic genes are more biologically meaningful.
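The sensitivity of a squared error term to outliers, which motivates the L-p-norm error term described in the abstract, can be illustrated with a small numerical sketch. This is not the RJELSR objective or its augmented Lagrange multiplier solver. It only compares how much of an elementwise loss sum(|r_i|^p) a single outlying residual accounts for when p = 2 (squared error) versus a smaller p. The residual values, the choice p = 0.5, and the function names are assumptions made for illustration.

```python
import numpy as np


def elementwise_loss(residuals, p):
    """Sum of |r_i|^p over all residuals (p = 2 gives the squared error)."""
    return float(np.sum(np.abs(residuals) ** p))


def outlier_share(residuals, outlier_index, p):
    """Fraction of the total loss contributed by one residual."""
    outlier_term = float(np.abs(residuals[outlier_index]) ** p)
    return outlier_term / elementwise_loss(residuals, p)


# Hypothetical residuals: four small errors and one gross outlier (last entry).
residuals = np.array([0.1, -0.2, 0.15, -0.1, 5.0])

share_squared = outlier_share(residuals, outlier_index=4, p=2.0)
share_lp = outlier_share(residuals, outlier_index=4, p=0.5)

print(f"outlier share of the loss, p = 2.0: {share_squared:.3f}")
print(f"outlier share of the loss, p = 0.5: {share_lp:.3f}")
```

With the squared error, the single outlier accounts for almost the entire loss, so a fit that minimizes it is pulled toward that one sample. With a smaller p its share is lower, which is the intuition behind replacing the squared error term with an L-p-norm term.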