A framework for gene representation on spatial transcriptomics

Shenghao Cao, Ye Yuan
DOI: https://doi.org/10.1101/2024.09.27.615337
2024-11-06
Abstract: Recent foundation models for single-cell transcriptomics data generate informative, context-aware gene representations. Spatially resolved transcriptomics data offer additional positional information, yet gene representation methods that integrate both intracellular and spatial context are lacking. Here, we introduce a gene representation framework tailored for spatial transcriptomics data. It incorporates ligand genes within the spatial niche into the Transformer encoder of single-cell transcriptomics. We further propose a biased cross-attention method to extend the framework to whole-transcriptome but low-resolution Visium data. We pretrained the framework on a hybrid dataset from two human tissue types with distinct developmental and disease states, and tested it on various downstream applications. Compared with the latest foundation model for single-cell transcriptomics, our spatially informed gene representations clustered cells more consistently, encoded hierarchy and membership within receptor-dependent gene networks more accurately, markedly improved identification of ligand-receptor interaction pairs, and could simulate perturbation effects of ligand-receptor interactions on downstream targets.
Bioinformatics
What problem does this paper attempt to address?