Abstract: Motivation: Single-cell RNA-seq analysis has emerged as a powerful tool for understanding inter-cellular heterogeneity. Due to the inherent noise of the data, computational techniques often rely on dimensionality reduction (DR) as both a pre-processing step and an analysis tool. Ideally, DR should preserve the biological information while discarding the noise. However, if the DR is to be used directly to gain biological insight, it must also be interpretable; that is, the individual dimensions of the reduction should correspond to specific biological variables such as cell-type identity or pathway activity. Maximizing biological interpretability necessitates making assumptions about the structure of the data, and the choice of model is critical. Results: We present a new probabilistic single-cell factor analysis model, Non-negative Independent Factor Analysis (NIFA), that incorporates different interpretability-inducing assumptions into a single modeling framework. The key advantage of our NIFA model is that it simultaneously models uni- and multi-modal latent factors, and thus isolates discrete cell-type identity and continuous pathway activity into separate components. We apply our approach to a range of datasets where cell-type identity is known, and we show that NIFA-derived factors outperform results from ICA, PCA, NMF and scCoGAPS (an NMF method designed for single-cell data) in terms of disentangling biological sources of variation. Studying an immunotherapy dataset in detail, we show that NIFA is able to reproduce and refine previous findings in a single analysis framework and enables the discovery of new clinically relevant cell states. Availability and implementation: NIFA is an R package that is freely available on GitHub (https://github.com/wgmao/NIFA). The test dataset is archived at https://zenodo.org/record/6286646. Supplementary information: Supplementary data are available at Bioinformatics online.

Fast and interpretable non-negative matrix factorization for atlas-scale single cell data

Optimization and expansion of non-negative matrix factorization

A fast and efficient count-based matrix factorization method for detecting cell types from single-cell RNAseq data

Extracting Characteristic Patterns from Genome-Wide Expression Data by Non-Negative Matrix Factorization

Detecting Heterogeneity in Single-Cell RNA-Seq Data by Non-Negative Matrix Factorization

Non-negative Matrix-Set Factorization

BANMF-S: a blockwise accelerated non-negative matrix factorization framework with structural network constraints for single cell imputation

Analyzing Single Cell RNA Sequencing with Topological Nonnegative Matrix Factorization

Libnmf - A Library for Nonnegative Matrix Factorization

Efficient Nonnegative Matrix Factorization Via Projected Newton Method

Sparse nonnegative matrix factorization applied to microarray data sets

Constrained non-negative matrix factorization enabling real-time insights of $\textit{in situ}$ and high-throughput experiments

Non-negative Independent Factor Analysis disentangles discrete and continuous sources of variation in scRNA-seq data

scMNMF: a novel method for single-cell multi-omics clustering based on matrix factorization

FastNMF: Highly Efficient Monotonic Fixed-Point Nonnegative Matrix Factorization Algorithm with Good Applicability

The non-negative matrix factorization toolbox for biological data mining

Graph-Regularized Non-Negative Matrix Factorization for Single-Cell Clustering in scRNA-Seq Data

Fastnmf: A Fast Monotonic Fixed-Point Non-Negative Matrix Factorization Algorithm With High Ease Of Use

Nonnegative Singular Value Decomposition For Microarray Data Analysis Of Spermatogenesis

Robust Structured Convex Nonnegative Matrix Factorization for Data Representation