Gene selection and cancer classification using Monte Carlo and nonnegative matrix factorization

Jing Chen, Qin Ma, Xiaoyan Hu, Miao Zhang, Dongdong Qin, Xiaoquan Lu
DOI: https://doi.org/10.1039/c6ra05694f
IF: 4.036
2016-01-01
RSC Advances
Abstract: Cancer classification is a key problem in clinical research, both for identifying genomic biomarkers and for treating cancerous tumors. The genes measured in gene expression profiling are potential biomarkers and can be used to classify cancer samples. However, the high dimensionality of gene data makes the samples difficult to classify, so identifying the significant genes is critical for classification. To identify these genes, nonnegative matrix factorization (NMF) represents gene information with sparse basis vectors of the gene data. However, basis vectors with imposed sparseness lose much of the useful information in the gene data. To represent this information more effectively, this study proposes a method named Monte Carlo-nonnegative matrix factorization (MC-NMF), which incorporates a Monte Carlo technique. The method is used to classify two cancer samples. The results show that the method can effectively estimate the significance of the genes and classify cancer samples with high accuracy.
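The abstract does not give the details of MC-NMF, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that the Monte Carlo step means repeatedly factorizing random subsets of the samples, and that a gene's significance can be approximated by how unevenly it loads on the NMF basis vectors, averaged over runs. The function names (`nmf`, `monte_carlo_gene_scores`), the scoring rule, the parameters, and the toy data are all hypothetical.

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (genes x samples) as W @ H.

    Uses the standard multiplicative updates for the Frobenius-norm
    objective. W holds the basis vectors, H the per-sample coefficients.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def monte_carlo_gene_scores(V, rank=2, n_runs=20, sample_frac=0.8, seed=0):
    """Average a per-gene score over NMF runs on random sample subsets.

    Assumption (not from the paper): the score of a gene in one run is the
    gap between its largest and smallest loading across the basis vectors,
    after each basis vector is normalized to sum to one.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    k = max(rank, int(round(sample_frac * n_samples)))
    scores = np.zeros(n_genes)
    for run in range(n_runs):
        # Monte Carlo step: draw a random subset of samples without replacement.
        cols = rng.choice(n_samples, size=k, replace=False)
        W, _ = nmf(V[:, cols], rank, seed=run)
        W_norm = W / (W.sum(axis=0, keepdims=True) + 1e-12)
        scores += W_norm.max(axis=1) - W_norm.min(axis=1)
    return scores / n_runs


# Synthetic toy data (made up for illustration): 30 genes, 20 samples,
# two classes of 10 samples; only the first 5 genes differ between classes.
data_rng = np.random.default_rng(42)
V = data_rng.uniform(0.5, 1.5, size=(30, 20))
informative = np.arange(5)
V[informative, :10] += 4.0   # raised expression in class A
V[informative, 10:] *= 0.1   # lowered expression in class B

scores = monte_carlo_gene_scores(V, rank=2)
ranking = np.argsort(scores)[::-1]  # genes ordered from highest to lowest score
```

In this sketch the highest-ranked genes would then be passed to a separate classifier; the abstract does not say which classifier the authors used, so that step is left out.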