Global Feature Selection from Microarray Data Using Lagrange Multipliers

Shiquan Sun, Qinke Peng, Xiaokang Zhang
DOI: https://doi.org/10.1016/j.knosys.2016.07.035
IF: 8.139
2016-01-01
Knowledge-Based Systems
Abstract: In microarray-based gene expression analysis, the expression levels of thousands of genes are monitored under a particular condition. However, only a few of them are highly expressed, as shown by Golub et al. Identifying these discriminative genes effectively is a significant challenge for risk assessment, diagnosis, and prognostication amid growing cancer incidence and mortality. In this paper, we present a global feature selection method based on a semidefinite programming model relaxed from a quadratic programming model that maximizes feature relevance and minimizes feature redundancy. The main advantage of the relaxation is that the matrix in the mathematical model need only be symmetric rather than positive definite or positive semidefinite. In the semidefinite programming model, each feature has one constraint that restricts the objective function of the feature selection problem. Another key idea of this paper is that we use the Lagrange multipliers as a proxy measure to identify the discriminative features instead of solving for a feasible solution of the original max-cut problem. The proposed method is compared with several popular feature selection methods on seven microarray data sets. The results demonstrate that our method outperforms the others on most data sets, especially on the two hard feature selection data sets, Breast(Wang) and Medulloblastoma. (C) 2016 Elsevier B.V. All rights reserved.
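The abstract does not state the model itself, so the following is only a sketch of the standard max-cut-style semidefinite relaxation and its Lagrangian dual, included to show where one multiplier per feature comes from. The symbols $Q$, $x$, $X$, and $y$ are assumptions introduced here for illustration: $Q$ stands for a symmetric $n \times n$ matrix assumed to combine feature relevance and redundancy (its exact construction is given in the paper, not in the abstract), and $n$ is the number of features.

```latex
% Illustrative sketch only: standard max-cut-style SDP relaxation.
% Q is an assumed symmetric matrix; it need not be positive (semi)definite.

% Binary quadratic program (max-cut form):
\max_{x \in \{-1,+1\}^n} \; x^\top Q x

% Lift X = x x^\top and drop the rank-one condition to obtain the SDP,
% with exactly one constraint per feature i:
\max_{X \succeq 0} \; \operatorname{tr}(Q X)
\quad \text{s.t.} \quad X_{ii} = 1, \qquad i = 1, \dots, n

% Attach one Lagrange multiplier y_i to each constraint X_{ii} = 1:
L(X, y) = \operatorname{tr}(Q X) + \sum_{i=1}^{n} y_i \,(1 - X_{ii})
        = \sum_{i=1}^{n} y_i + \operatorname{tr}\bigl((Q - \operatorname{Diag}(y))\, X\bigr)

% The supremum over X \succeq 0 is finite only if Diag(y) - Q is PSD,
% which gives the dual problem:
\min_{y \in \mathbb{R}^n} \; \sum_{i=1}^{n} y_i
\quad \text{s.t.} \quad \operatorname{Diag}(y) - Q \succeq 0
```

Under this reading, each optimal multiplier $y_i^{*}$ is tied to the constraint of feature $i$, which is presumably what lets the multipliers serve as per-feature scores without recovering a feasible $\pm 1$ solution. How the scores are ordered and thresholded is not specified in the abstract.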