Post‐modified Non‐negative Matrix Factorization for Deconvoluting the Gene Expression Profiles of Specific Cell Types from Heterogeneous Clinical Samples Based on RNA‐sequencing Data

Yuan Liu, Yu Liang, Qifan Kuang, Fanfan Xie, Yingyi Hao, Zhining Wen, Menglong Li
DOI: https://doi.org/10.1002/cem.2929
IF: 2.5
2017-01-01
Journal of Chemometrics
Abstract: Several supervised algorithms have been proposed to estimate the gene expression patterns of specific cell types from heterogeneous samples, but their application in clinical practice has been limited by the lack of information on pure cell types. Post-modified non-negative matrix factorization (NMF), the unsupervised algorithm we propose here, can estimate the gene expression profiles and proportions of the major cell types in cancer samples without any prior reference knowledge. Post-modified NMF was first evaluated on simulated data sets and then applied to deconvolute the gene expression profiles of cancer samples. It performed satisfactorily on both the validation and application data. In the application to 3 types of cancer, the differentially expressed genes (DEGs) identified from the deconvoluted gene expression profiles of tumor cells were highly associated with cancer-related gene sets. Moreover, the estimated proportions of tumor cells differed significantly between the 2 patient groups compared for each clinical endpoint. Our results indicate that post-modified NMF can efficiently extract the gene expression patterns of specific cell types from heterogeneous samples for subsequent analysis and prediction, which should benefit clinical prognosis.
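To make the deconvolution idea concrete, the following is a minimal sketch of plain NMF with Lee-Seung multiplicative updates, written in dependency-free Python. It is not the post-modified NMF of the paper, whose post-modification step the abstract does not describe. The function names (`nmf`, `to_proportions`, `frobenius_error`), the toy profiles, and the column normalization used to read the coefficients as proportions are all assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(u * v for u, v in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(row) for row in zip(*a)]

def nmf(x, k, n_iter=300, seed=0, eps=1e-9):
    """Factor x (genes x samples) into w (genes x k) and h (k x samples).

    Uses multiplicative updates, which keep every entry non-negative.
    """
    rng = random.Random(seed)
    n_genes, n_samples = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_genes)]
    h = [[rng.random() + 0.1 for _ in range(n_samples)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(n_samples)] for i in range(k)]
        # W <- W * (X H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(k)] for i in range(n_genes)]
    return w, h

def to_proportions(h):
    """Assumed heuristic: scale each sample's coefficients to sum to 1."""
    col_sums = [sum(col) for col in zip(*h)]
    return [[v / s for v, s in zip(row, col_sums)] for row in h]

def frobenius_error(x, w, h):
    """Square root of the summed squared residuals of x - w h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(x, approx)
               for a, b in zip(ra, rb)) ** 0.5

# Toy data (hypothetical): two pure profiles mixed at known tumor fractions.
tumor = [10.0, 8.0, 1.0, 0.5]
stroma = [1.0, 0.5, 9.0, 7.0]
fractions = [0.2, 0.5, 0.8, 0.35, 0.65]
x = [[t * p + s * (1.0 - p) for p in fractions] for t, s in zip(tumor, stroma)]

w, h = nmf(x, k=2)
props = to_proportions(h)
print("reconstruction error:", frobenius_error(x, w, h))
```

In this sketch the columns of `w` play the role of estimated cell-type expression profiles and `props` the estimated per-sample proportions. Plain NMF leaves the order and scale of the components undetermined, so matching a component to "tumor" or "stroma" requires additional information that this example does not model.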