Sparse P-Norm Nonnegative Matrix Factorization for Clustering Gene Expression Data.

Weixiang Liu, Kehong Yuan
DOI: https://doi.org/10.1504/ijdmb.2008.020524
2008-01-01
International Journal of Data Mining and Bioinformatics
Abstract: Nonnegative Matrix Factorization (NMF) is a powerful tool for gene expression data analysis, as it reduces thousands of genes to a few compact metagenes, especially in clustering gene expression samples for cancer class discovery. Enhancing the sparseness of the factorisation can find only a few dominantly coexpressed metagenes and improve the clustering effectiveness. Sparse p-norm (p > 1) Nonnegative Matrix Factorization (sp-NMF) is a sparser representation method that uses a high-order norm to normalise the decomposed components. In this paper, we investigate the benefit of high-order normalisation for clustering cancer-related gene expression samples. Experimental results demonstrate that sp-NMF leads to robust and effective clustering, both in automatically determining the number of clusters and in achieving high accuracy.
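The idea of normalising the decomposed components with a p-norm can be illustrated with a minimal sketch. It is not the authors' sp-NMF algorithm, which the abstract does not specify. It assumes plain multiplicative-update NMF with a Frobenius loss, a genes-by-samples matrix V, and a rescaling of each basis column (metagene) to unit p-norm after every iteration. The function names `nmf_pnorm` and `cluster_samples` and all parameters are hypothetical.

```python
import numpy as np

def nmf_pnorm(V, k, p=2.0, n_iter=200, seed=0, eps=1e-9):
    """Factor V (genes x samples) as W @ H with nonnegative W, H.

    Assumption: standard multiplicative updates for the Frobenius loss,
    followed by rescaling each column of W to unit p-norm. This is an
    illustrative stand-in, not the update rule from the paper.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Normalise each metagene (column of W) by its p-norm and
        # move the scale into the matching row of H, so W @ H is unchanged.
        norms = np.linalg.norm(W, ord=p, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H

def cluster_samples(H):
    """Assign each sample to the metagene with the largest coefficient."""
    return np.argmax(H, axis=0)
```

In this sketch the rescaling leaves the product W @ H unchanged, so the choice of p only changes the relative scale of the rows of H and therefore which metagene wins the per-sample assignment. How the high-order norm enters the actual sp-NMF updates should be taken from the paper itself.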