Abstract: Probabilistic models have provided the underpinnings for state-of-the-art performance in many single-cell omics data analysis tasks, including dimensionality reduction, clustering, differential expression, annotation, removal of unwanted variation, and integration across modalities. Many of the models being deployed are amenable to scalable stochastic inference techniques, and accordingly they are able to process single-cell datasets of realistic and growing sizes. However, the community-wide adoption of probabilistic approaches is hindered by a fractured software ecosystem, resulting in an array of packages with distinct and often complex interfaces. To address this issue, we developed scvi-tools (), a Python package that implements a variety of leading probabilistic methods. These methods, which cover many fundamental analysis tasks, are accessible through a standardized, easy-to-use interface with direct links to Scanpy, Seurat, and Bioconductor workflows. By standardizing the implementations, we were able to develop and reuse novel functionalities across different models, such as support for complex study designs through nonlinear removal of unwanted variation due to multiple covariates and reference-query integration via scArches. The extensible software building blocks that underlie scvi-tools also enable a developer environment in which new probabilistic models for single-cell omics can be efficiently developed, benchmarked, and deployed. We demonstrate this through a code-efficient reimplementation of Stereoscope for deconvolution of spatial transcriptomics profiles. By catering to both the end user and developer audiences, we expect scvi-tools to become an essential software dependency and serve to formulate a community standard for probabilistic modeling of single-cell omics.

### Competing Interest Statement

O.C. is supported by the EPSRC Centre for Doctoral Training in Modern Statistics and Statistical Machine Learning (EP/S023151/1) and Novo Nordisk. V.S. is a full-time employee of Serqet Therapeutics and has ownership interest in Serqet Therapeutics. F.J.T. reports receiving consulting fees from Roche Diagnostics GmbH and Cellarity, Inc., and ownership interest in Cellarity, Inc.

Interpretable factor models of single-cell RNA-seq via variational autoencoders

Learning interpretable latent autoencoder representations with annotations of feature sets

scVAE: variational auto-encoders for single-cell gene expression data

ScInfoVAE: interpretable dimensional reduction of single cell transcription data with variational autoencoders and extended mutual information regularization

Biophysical modeling with variational autoencoders for bimodal, single-cell RNA sequencing data

Interpretable models for scRNA-seq data embedding with multi-scale structure preservation

Scalable probabilistic matrix factorization for single-cell RNA-seq analysis

Using Multi-Encoder Semi-Implicit Graph Variational Autoencoder to Analyze Single-Cell RNA Sequencing Data

Automatic identification of relevant genes from low-dimensional embeddings of single-cell RNA-seq data

A deep generative model for single-cell RNA sequencing with application to detecting differentially expressed genes

f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq

Biological network-inspired interpretable variational autoencoder

Parameter tuning is a key part of dimensionality reduction via deep variational autoencoders for single cell RNA transcriptomics

FISHFactor: A Probabilistic Factor Model for Spatial Transcriptomics Data with Subcellular Resolution

scVGAE: A Novel Approach using ZINB-Based Variational Graph Autoencoder for Single-Cell RNA-Seq Imputation

A deep generative model for gene expression profiles from single-cell RNA sequencing

Out-of-distribution Prediction with Disentangled Representations for Single-Cell RNA Sequencing Data

Scvi-Tools: a Library for Deep Probabilistic Analysis of Single-Cell Omics Data

VEGA is an interpretable generative model for inferring biological network activity in single-cell transcriptomics

Deep Generative Autoencoder for Low-Dimensional Embeding Extraction from Single-Cell RNAseq Data

VASC: Dimension Reduction and Visualization of Single-cell RNA-seq Data by Deep Variational Autoencoder