TopOMetry systematically learns and evaluates the latent dimensions of single-cell atlases

Davi Sidarta-Oliveira, Ana Domingos, Licio A Velloso
DOI: https://doi.org/10.1101/2022.03.14.484134
2024-06-04
Abstract: A core task in single-cell data analysis is recovering the latent dimensions encoding the genetic and epigenetic landscapes inhabited by cell types and lineages. However, consensus is lacking for optimal modeling and visualization approaches. Here, we propose these landscapes are ideally modeled as Riemannian manifolds and present TopOMetry, a computational toolkit based on Laplacian-type operators to learn these manifolds. TopOMetry systematically learns and evaluates dozens of possible representations, eliminating the need to choose a single dimensional reduction method a priori. The learned visualizations preserve more original information than current PCA-based standards across single-cell and non-biological datasets. TopOMetry allows users to estimate intrinsic dimensionalities and visualize distortions with the Riemannian metric, among other challenging tasks. Illustrating its hypothesis generation power, TopOMetry suggests the existence of dozens of novel T cell subpopulations consistently found across public datasets that correspond to specific clonotypes. TopOMetry is available at https://github.com/davisidarta/topometry.
Bioinformatics
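
The abstract refers to learning manifolds with Laplacian-type operators. As a rough illustration of that general idea, the sketch below implements classical Laplacian Eigenmaps with NumPy: build a k-nearest-neighbor graph, form the symmetric normalized graph Laplacian, and use its low-frequency eigenvectors as latent coordinates. This is not TopOMetry's API or the authors' method; the function name `laplacian_eigenmap`, its parameters, the Gaussian kernel, and the median bandwidth are all assumptions chosen for illustration.

```python
import numpy as np


def laplacian_eigenmap(X, n_neighbors=10, n_components=2):
    """Hypothetical helper: embed the rows of X with Laplacian Eigenmaps.

    Returns the smallest non-trivial eigenvalues and their eigenvectors
    of the symmetric normalized graph Laplacian of a kNN graph.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances, clamped to avoid tiny negatives.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # Exclude each point from its own neighbor list.
    np.fill_diagonal(d2, np.inf)
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    # Gaussian kernel; the bandwidth (median kNN squared distance) is an assumption.
    sigma2 = np.median(d2[rows, cols])
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    # Symmetrize so the Laplacian has real eigenvalues.
    W = np.maximum(W, W.T)
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Drop the first (trivial, eigenvalue ~0) eigenvector.
    return evals[1:n_components + 1], evecs[:, 1:n_components + 1]


# Usage on synthetic data standing in for a cells-by-features matrix.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
eigenvalues, embedding = laplacian_eigenmap(X_demo, n_neighbors=10, n_components=2)
print(embedding.shape)
```

The dense distance matrix keeps the sketch short but scales quadratically with the number of cells; a real single-cell workflow would use an approximate nearest-neighbor search and sparse eigensolvers instead.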