Abstract: DNA microarray technology is widely used in clinical research to generate gene expression profiles from biological samples. Identifying differentially expressed genes (DEGs) between two phenotype groups or distinct biological conditions from these data is a crucial step in discovering disease biomarkers. However, clinical samples usually contain multiple cell types. This cellular heterogeneity strongly affects gene expression patterns and can mask the biological differences between the two groups being compared. Identifying DEGs from the mixed expression profile of multiple cell types, rather than from the profile of the cell type of interest, seriously reduces the sensitivity of detecting disease-related genes. We therefore propose using nonnegative matrix factorization (NMF), an unsupervised learning method that has been successfully applied in bioinformatics research, to extract the gene expression profile of the cell type of interest from the mixed profile of a heterogeneous cell population. We first evaluated the performance of the NMF algorithm in deconvoluting gene expression data using a well-controlled data set comprising gene expression profiles from three tissues and eleven different mixtures with known proportions. NMF was then applied to human whole-blood gene expression data from 24 kidney transplant recipients to estimate the pure gene expression profiles of five major blood cell types, which were subsequently used to identify genes related to acute rejection of kidney transplants. The number of DEGs (probe sets) identified between stable post-transplant recipients and those experiencing acute rejection was greater for each of the five deconvoluted blood cell profiles than for the whole-blood samples.
Finally, the DEGs were submitted to Gene Set Enrichment Analysis (GSEA) to test for enrichment of signaling pathways and gene ontology terms. Several enriched pathways and gene ontology terms were significantly associated with renal transplant rejection when the submitted DEGs had been identified from the two high-abundance blood cell types, whereas no pathway or gene ontology term was enriched when the DEGs had been identified from whole-blood samples. These results indicate that using the gene expression profile of a specific cell type deconvoluted by NMF can effectively increase the sensitivity of discovering potentially disease-related genes. In addition, this unsupervised method can estimate the pure gene expression profile of a specific cell type from mixtures without prior knowledge of cell proportions.
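
The deconvolution idea can be illustrated with a minimal sketch. It assumes the standard NMF model V ≈ WH, where V is a genes × samples matrix of mixed expression values, each column of W is the expression profile of one cell type, and each column of H holds the mixing weights of one sample. The update rule is the generic Lee-Seung multiplicative update for the Frobenius objective, which is not necessarily the variant used in the study. The function names `nmf_deconvolve` and `to_proportions`, the synthetic data, and the column normalization of H are illustrative assumptions.

```python
import numpy as np


def nmf_deconvolve(V, k, n_iter=2000, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (genes x samples) as W @ H.

    W (genes x k) approximates per-cell-type expression profiles and
    H (k x samples) approximates per-sample mixing weights.
    Uses generic multiplicative updates (an assumption, not the
    study's exact algorithm).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    # Strictly positive random start keeps the updates well defined.
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative and does not
        # increase the squared reconstruction error.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def to_proportions(H, eps=1e-12):
    """Rescale each sample's weights to sum to 1 (a simple heuristic)."""
    return H / (H.sum(axis=0, keepdims=True) + eps)


# Synthetic stand-in for a controlled benchmark: 3 pure profiles
# mixed into 11 samples with known proportions (values are made up).
rng = np.random.default_rng(42)
W_true = rng.random((200, 3))                    # genes x cell types
H_true = rng.dirichlet(np.ones(3), size=11).T    # cell types x samples
V = W_true @ H_true                              # mixed profiles

W_est, H_est = nmf_deconvolve(V, k=3)
rel_err = np.linalg.norm(V - W_est @ H_est) / np.linalg.norm(V)
P_est = to_proportions(H_est)
print(f"relative reconstruction error: {rel_err:.4f}")
print("estimated proportions, first sample:", np.round(P_est[:, 0], 3))
```

NMF recovers the factors only up to a permutation and rescaling of the components, so in practice each estimated column of W still has to be matched to a cell type, for example through known marker genes, before group comparisons are made on it.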

Jnmfma: a Joint Non-Negative Matrix Factorization Meta-Analysis of Transcriptomics Data

Joint Nonnegative Matrix Factorization Based on Sparse and Graph Laplacian Regularization for Clustering and Co-Differential Expression Genes Analysis.

JSNMF enables effective and accurate integrative analysis of single-cell multiomics data

Regularized Non-Negative Matrix Factorization for Identifying Differentially Expressed Genes and Clustering Samples: A Survey.

Flexible Non-Negative Matrix Factorization to Unravel Disease-Related Genes

M6adecom: Analysis of M6a Profile Matrix Based on Graph Regularized Non-Negative Matrix Factorization.

Nonnegative matrix factorization for the improvement in sensitivity of discovering potentially disease-related genes

A regulation probability model-based meta-analysis of multiple transcriptomics data sets for cancer biomarker identification

MetaTX: deciphering the distribution of mRNA-related features in the presence of isoform ambiguity, with applications in epitranscriptome analysis

Extracting Characteristic Patterns from Genome-Wide Expression Data by Non-Negative Matrix Factorization

Detecting Heterogeneity in Single-Cell RNA-Seq Data by Non-Negative Matrix Factorization.

A Graph Regularized Non-Negative Matrix Factorization Method for Identifying Microrna-Disease Associations

Mining Functional Modules by Multiview-NMF of Phenome-Genome Association

Post‐modified Non‐negative Matrix Factorization for Deconvoluting the Gene Expression Profiles of Specific Cell Types from Heterogeneous Clinical Samples Based on RNA‐sequencing Data

Unsupervised Cluster Analysis and Gene Marker Extraction of scRNA-seq Data Based On Non-Negative Matrix Factorization

JOINT for large-scale single-cell RNA-sequencing analysis via soft-clustering and parallel computing

Gene Ranking of RNA-Seq Data Via Discriminant Non-Negative Matrix Factorization

Regularized Non-Negative Matrix Factorization for Identifying Differentially Expressed Genes and Clustering Samples

Fast and interpretable non-negative matrix factorization for atlas-scale single cell data

scMNMF: a novel method for single-cell multi-omics clustering based on matrix factorization

An Efficient Nonnegative Matrix Factorization Model for Finding Cancer Associated Genes by Integrating Data from Genome, Transcriptome and Interactome.