FISHFactor: A Probabilistic Factor Model for Spatial Transcriptomics Data with Subcellular Resolution

Oliver Stegle, Britta Velten, Florin Walter
DOI: https://doi.org/10.1101/2021.11.04.467354
2021-11-05
Abstract: Factor analysis is a widely used method for dimensionality reduction of high-throughput datasets in molecular biology and has recently been adapted to spatial transcriptomics data. However, existing methods assume (count) matrices as input and are therefore not directly applicable to single-molecule resolved data, which increasingly arise, for example, from multiplexed fluorescence in situ hybridization or in situ sequencing experiments. To address this, we propose FISHFactor, a probabilistic model that combines the benefits of spatial, non-negative factor analysis with a Poisson point process likelihood to explicitly model the nature of single-molecule resolved data. FISHFactor furthermore leverages principles of multi-modal factor analysis to dissect the transcriptional heterogeneity between multiple groups of samples, such as different cells. Using simulated and real data, we show that our approach yields improved estimates of the true spatial transcriptome landscape compared to existing methods that rely on aggregating information by spatial binning. Applied to a set of NIH/3T3 cells, FISHFactor identifies major subcellular expression patterns and accurately recovers known spatial gene clusters.
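To make the phrase "Poisson point process likelihood" concrete, the following is a minimal sketch of how such a likelihood can be evaluated directly on molecule coordinates, without binning. It is not the authors' implementation: the abstract does not say how FISHFactor parameterizes or infers its factors, so the factor functions `uniform` and `gradient`, the unit-square domain, and the midpoint-rule integral are all assumptions made for illustration. For one gene, the intensity is taken to be a non-negative weighted sum of spatial factors, lambda(x) = sum_k w_k * z_k(x), and the log-likelihood of the observed points x_1..x_n is sum_i log lambda(x_i) minus the integral of lambda over the domain.

```python
import numpy as np


def uniform(points):
    """Hypothetical spatial factor: constant over the unit square."""
    return np.ones(len(points))


def gradient(points):
    """Hypothetical spatial factor: increases linearly along the x-axis."""
    return 2.0 * points[:, 0]


def intensity(points, weights, factor_fns):
    """Intensity lambda(x) = sum_k w_k * z_k(x) with non-negative weights."""
    z = np.stack([f(points) for f in factor_fns], axis=1)  # (n_points, n_factors)
    return z @ weights


def point_process_loglik(points, weights, factor_fns, grid_size=200):
    """Log-likelihood of an inhomogeneous Poisson point process on [0, 1]^2.

    Computes sum_i log lambda(x_i) - integral of lambda(x) dx, where the
    integral is approximated with a midpoint rule on a regular grid.
    """
    centers = (np.arange(grid_size) + 0.5) / grid_size
    gx, gy = np.meshgrid(centers, centers)
    grid = np.column_stack([gx.ravel(), gy.ravel()])
    # The domain has area 1, so the mean over grid cells is the integral.
    integral = intensity(grid, weights, factor_fns).mean()
    return np.log(intensity(points, weights, factor_fns)).sum() - integral


# Two molecule positions for a single gene (made-up coordinates).
points = np.array([[0.5, 0.5], [0.25, 0.75]])
weights = np.array([10.0, 0.0])  # only the uniform factor is active
loglik = point_process_loglik(points, weights, [uniform, gradient])
```

A binning-based approach would instead count molecules per grid cell and model the counts; the point process form above uses each coordinate exactly as observed, which is the distinction the abstract draws.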