Abstract

Motivation: Single-cell RNA sequencing (scRNA-seq) is an increasingly popular technique for transcriptomic analysis of gene expression at the single-cell level. Cell-type clustering is the first crucial task in scRNA-seq data analysis: it facilitates accurate identification of cell types and the study of the characteristics of their transcripts. Recently, several computational models based on deep autoencoders and ensemble clustering have been developed to analyze scRNA-seq data. However, current deep autoencoders do not sufficiently learn the latent representations of scRNA-seq data, and obtaining consensus partitions from these feature representations remains under-explored.

Results: To address these challenges, we propose scBGEDA, a single-cell deep clustering model that combines a dual denoising autoencoder with bipartite graph ensemble clustering to identify specific cell populations in single-cell transcriptome profiles. First, a single-cell dual denoising autoencoder network projects the data into a compressed low-dimensional space and learns feature representations by jointly optimizing the zero-inflated negative binomial reconstruction loss and the denoising reconstruction loss. Then, a bipartite graph ensemble clustering algorithm exploits the relationships between cells and the learned latent embedded space by means of a graph-based consensus function. Comparison experiments were conducted on 20 scRNA-seq datasets from different sequencing platforms using a variety of clustering metrics. The experimental results indicate that scBGEDA outperforms other state-of-the-art methods on these datasets and scales to large scRNA-seq datasets. Moreover, scBGEDA identifies cell-type-specific marker genes and supports functional genomic analysis by quantifying the influence of genes on cell clusters, bringing new insights into identifying cell types and characterizing scRNA-seq data from different perspectives.

Availability and implementation: The source code of scBGEDA is available at https://github.com/wangyh082/scBGEDA. The software and the supporting data can be downloaded from https://figshare.com/articles/software/scBGEDA/19657911.

Supplementary information: Supplementary data are available at Bioinformatics online.
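
The abstract names two loss terms that are optimized together: a zero-inflated negative binomial (ZINB) reconstruction loss and a denoising reconstruction loss. The sketch below illustrates how such a combined objective could be evaluated on a handful of values. It is not the scBGEDA implementation: the function names, the mean-squared-error form of the denoising term, and the `weight` parameter are all assumptions made for illustration, and `recon` is assumed to be a decoder output obtained from a noise-corrupted copy of `clean`.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu and dispersion theta (standard mean/dispersion form)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count under a ZINB model.
    pi is the probability of an extra (dropout) zero."""
    nb = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        # A zero can come from the dropout component or from the NB itself.
        likelihood = pi + (1.0 - pi) * nb
    else:
        likelihood = (1.0 - pi) * nb
    return -math.log(likelihood)


def combined_loss(counts, mu, theta, pi, clean, recon, weight=1.0):
    """Hypothetical joint objective: mean ZINB NLL plus a weighted
    mean-squared denoising error (the MSE form is an assumption)."""
    zinb = sum(zinb_nll(x, m, theta, pi) for x, m in zip(counts, mu)) / len(counts)
    mse = sum((c - r) ** 2 for c, r in zip(clean, recon)) / len(clean)
    return zinb + weight * mse


if __name__ == "__main__":
    # Toy values chosen only to exercise the functions.
    loss = combined_loss(
        counts=[0, 1], mu=[2.0, 2.0], theta=1.0, pi=0.5,
        clean=[0.0, 1.0], recon=[0.1, 0.9],
    )
    print(f"combined loss: {loss:.4f}")
```

In a real model, `mu`, `theta` and `pi` would be per-gene outputs of the decoder and the loss would be minimized by gradient descent; the scalar version here only makes the likelihood terms explicit.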

DAE-TPGM: A deep autoencoder network based on a two-part-gamma model for analyzing single-cell RNA-seq data

AE-TPGG: a novel autoencoder-based approach for single-cell RNA-seq data imputation and dimensionality reduction

Single-cell RNA-seq Denoising Using a Deep Count Autoencoder

Deep autoencoder for interpretable tissue-adaptive deconvolution and cell-type-specific gene analysis

DSAE-Impute: Learning Discriminative Stacked Autoencoders for Imputing Single-cell RNA-seq Data

Uncovering the Key Dimensions of High-Throughput Biomolecular Data Using Deep Learning

scGMAAE: Gaussian mixture adversarial autoencoders for diversification analysis of scRNA-seq data

Improved downstream functional analysis of single-cell RNA-sequence data using DGAN

scDAC: deep adaptive clustering of single-cell transcriptomic data with coupled autoencoder and Dirichlet process mixture model

Graph embedding and Gaussian mixture variational autoencoder network for end-to-end analysis of single-cell RNA sequencing data

scBGEDA: deep single-cell clustering analysis via a dual denoising autoencoder with bipartite graph ensemble clustering

Sparsity-Penalized Stacked Denoising Autoencoders for Imputing Single-Cell RNA-seq Data

A deep auto-encoder model for gene expression prediction

An Autoencoder-Based Deep Learning Method for Genotype Imputation

DP-DCAN: Differentially Private Deep Contrastive Autoencoder Network for Single-cell Clustering

scSemiAE: a deep model with semi-supervised learning for single-cell transcriptomics

An End-to-End Deep Hybrid Autoencoder Based Method for Single-Cell RNA-Seq Data Analysis

Optimization and Redevelopment of Single-Cell Data Analysis Workflow Based on Deep Generative Models

ScDA: A Denoising AutoEncoder Based Dimensionality Reduction for Single-cell RNA-seq Data

High-throughput Single-Cell RNA-seq Data Imputation and Characterization with Surrogate-Assisted Automated Deep Learning

scMAE: a masked autoencoder for single-cell RNA-seq clustering