Accelerated dimensionality reduction of single-cell RNA sequencing data with fastglmpca

Eric Weine, Peter Carbonetto, Matthew Stephens
DOI: https://doi.org/10.1101/2024.03.23.586420
2024-07-04
Abstract: Motivated by theoretical and practical issues that arise when applying Principal Components Analysis (PCA) to count data, Townes et al. introduced "Poisson GLM-PCA", a variation of PCA adapted to count data, as a tool for dimensionality reduction of single-cell RNA sequencing (RNA-seq) data. However, fitting GLM-PCA is computationally challenging. Here we study this problem and show that a simple algorithm, which we call "Alternating Poisson Regression" (APR), produces better-quality fits, and in less time, than existing algorithms. APR is also memory-efficient and lends itself to parallel implementation on multi-core processors, both of which are helpful for handling large single-cell RNA-seq data sets. We illustrate the benefits of this approach on two published single-cell RNA-seq data sets. The new algorithms are implemented in an R package, fastglmpca.
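To make the idea of alternating Poisson regressions concrete, here is a minimal sketch in plain Python. It is not the fastglmpca implementation. It assumes the usual Poisson GLM-PCA model, in which count y_ij has rate exp(sum_k L[i][k] * F[j][k]), and it assumes each regression is solved by one pass of coordinate-wise Newton updates with step halving. The function names, the initialization, and the fixed iteration count are all hypothetical choices made for illustration.

```python
import math
import random


def _loglik(y, eta):
    """Poisson log-likelihood of counts y given log-rates eta (log y! term dropped)."""
    if max(eta) > 700:  # exp() would overflow; treat this as an unacceptable step
        return -math.inf
    return sum(yj * e - math.exp(e) for yj, e in zip(y, eta))


def poisson_regression_pass(y, X, b):
    """One cyclic pass of coordinate-wise Newton updates for a log-linear
    Poisson regression of y on the columns of X, starting from coefficients b."""
    b = list(b)
    eta = [sum(x * bk for x, bk in zip(row, b)) for row in X]
    current = _loglik(y, eta)
    for k in range(len(b)):
        mu = [math.exp(e) for e in eta]
        grad = sum(row[k] * (yj - m) for row, yj, m in zip(X, y, mu))
        curv = sum(row[k] ** 2 * m for row, m in zip(X, mu))
        if curv <= 0.0:
            continue  # this coordinate has no influence on the fit
        step = grad / curv
        for _ in range(30):  # halve the step until the likelihood does not drop
            trial = [e + step * row[k] for e, row in zip(eta, X)]
            value = _loglik(y, trial)
            if value >= current:
                b[k] += step
                eta, current = trial, value
                break
            step /= 2
    return b


def total_loglik(Y, L, F):
    """Log-likelihood of the full count matrix Y under log-rates L F^T."""
    return sum(
        _loglik(Y[i], [sum(l * f for l, f in zip(L[i], Fj)) for Fj in F])
        for i in range(len(Y))
    )


def alternating_poisson_regression(Y, K, iters=50, seed=0):
    """Fit rank-K factors by alternating row-wise and column-wise regressions."""
    rng = random.Random(seed)
    n, m = len(Y), len(Y[0])
    L = [[rng.gauss(0, 0.1) for _ in range(K)] for _ in range(n)]
    F = [[rng.gauss(0, 0.1) for _ in range(K)] for _ in range(m)]
    Yt = [list(col) for col in zip(*Y)]  # columns of Y, one list per column
    history = [total_loglik(Y, L, F)]
    for _ in range(iters):
        # With F fixed, each row of L is an independent Poisson regression on F.
        L = [poisson_regression_pass(Y[i], F, L[i]) for i in range(n)]
        # With L fixed, each row of F is an independent Poisson regression on L.
        F = [poisson_regression_pass(Yt[j], L, F[j]) for j in range(m)]
        history.append(total_loglik(Y, L, F))
    return L, F, history


# Toy count matrix with two visible groups of rows and columns.
Y = [[5, 0, 1, 7], [4, 1, 0, 6], [0, 6, 5, 0], [1, 7, 4, 1]]
L, F, history = alternating_poisson_regression(Y, K=2)
print(history[0], history[-1])
```

In this sketch the regressions within each half-step do not depend on one another, which is the property that makes the approach amenable to parallel execution. The step halving keeps the log-likelihood from decreasing between iterations, up to floating-point rounding.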
Bioinformatics