scGMAI: a Gaussian mixture model for clustering single-cell RNA-Seq data based on deep autoencoder

Bin Yu, Chen Chen, Ren Qi, Ruiqing Zheng, Patrick J Skillman-Lawrence, Xiaolin Wang, Anjun Ma, Haiming Gu
DOI: https://doi.org/10.1093/bib/bbaa316
IF: 9.5
2020-12-10
Briefings in Bioinformatics
Abstract: The rapid development of single-cell RNA sequencing (scRNA-Seq) technology provides strong technical support for accurate and efficient analysis of single-cell gene expression data. However, the analysis of scRNA-Seq data faces many obstacles, including dropout events and the curse of dimensionality. Here, we propose scGMAI, a new single-cell Gaussian mixture clustering method based on autoencoder networks and fast independent component analysis (FastICA). Specifically, scGMAI uses autoencoder networks to reconstruct gene expression values from scRNA-Seq data, and FastICA to reduce the dimensionality of the reconstructed data. By integrating these techniques, scGMAI outperforms existing tools, including Seurat, in clustering cells from 17 public scRNA-Seq datasets. In summary, scGMAI is an effective tool for accurately clustering and identifying cell types from scRNA-Seq data and shows great potential for scRNA-Seq data analysis. The source code is available at https://github.com/QUST-AIBBDRC/scGMAI/.
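The three stages named in the abstract (autoencoder reconstruction, FastICA dimensionality reduction, Gaussian mixture clustering) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available at the repository above. It assumes scikit-learn, uses `MLPRegressor` trained to reproduce its own input as a stand-in autoencoder, and the function name `cluster_cells`, the log transform, the layer sizes, and the number of components are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPRegressor


def cluster_cells(counts, n_clusters, n_components=5, seed=0):
    """Cluster cells from a (cells x genes) count matrix.

    Illustrative pipeline only: reconstruct -> FastICA -> Gaussian mixture.
    All hyperparameters here are assumptions, not values from the paper.
    """
    # Log-transform the counts to compress their dynamic range (assumed step).
    x = np.log1p(np.asarray(counts, dtype=float))

    # Stand-in autoencoder: an MLP fitted to map the input onto itself
    # through a narrow middle layer, then used to reconstruct the data.
    autoencoder = MLPRegressor(
        hidden_layer_sizes=(64, 16, 64),
        activation="relu",
        max_iter=300,
        random_state=seed,
    )
    autoencoder.fit(x, x)
    reconstructed = autoencoder.predict(x)

    # FastICA reduces the reconstructed matrix to a few independent components.
    ica = FastICA(n_components=n_components, max_iter=1000, random_state=seed)
    reduced = ica.fit_transform(reconstructed)

    # A Gaussian mixture model assigns each cell to one cluster.
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed)
    return gmm.fit_predict(reduced)


if __name__ == "__main__":
    # Synthetic data: 3 groups of 40 cells over 50 genes, each group with
    # its own Poisson mean per gene (made-up data, for demonstration only).
    rng = np.random.default_rng(0)
    means = rng.uniform(1.0, 20.0, size=(3, 50))
    counts = np.vstack([rng.poisson(m, size=(40, 50)) for m in means])
    labels = cluster_cells(counts, n_clusters=3)
    print(labels.shape)
```

The function returns one integer label per cell. On real data the number of clusters and the number of independent components would need to be chosen for each dataset.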
biochemical research methods, mathematical & computational biology