CellPatch: a Highly Efficient Foundation Model for Single-Cell Transcriptomics with Heuristic Patching

Hua-Jun Wu, Xiaoqi Zheng, Zeyu Ma, Hanwen Zhu, Yushun Yuan, Jiyuan Yang, Kangwen Cai, Nana Wei, Senxin Zhang, Lu Wang, Jiang Wenjie, Yuanchen Sun, Yu-Juan Wang, An Liu, Futing Lai
DOI: https://doi.org/10.1101/2024.11.15.623701
2024-11-17
Abstract: The rapid advancement of foundation models has significantly enhanced the analysis of single-cell omics data, enabling researchers to gain deeper insights into the complex interactions between cells and genes across diverse tissues. However, existing foundation models often exhibit excessive complexity, hindering their practical utility for downstream tasks. Here, we present CellPatch, a lightweight foundation model that leverages the strengths of the cross-attention mechanism and patch tokenization to reduce model complexity while extracting efficient biological representations. Comprehensive evaluations conducted on single-cell RNA-sequencing datasets across multiple organs and tissue states demonstrate that CellPatch achieves state-of-the-art performance in downstream analytical tasks while maintaining ultra-low computational costs during both the pretraining and finetuning phases. Moreover, the flexibility and scalability of CellPatch allow it to serve as a general framework that can be integrated with other well-established single-cell analysis software, thereby enhancing their performance through transfer learning on diverse downstream tasks.
Bioinformatics