SEraster: a rasterization preprocessing framework for scalable spatial omics data analysis

Gohta Aihara, Kalen Clifton, Mayling Chen, Zhuoyan Li, Lyla Atta, Brendan F Miller, Rahul Satija, John W Hickey, Jean Fan
DOI: https://doi.org/10.1093/bioinformatics/btae412
IF: 5.8
2024-07-01
Bioinformatics
Abstract: Motivation: Spatial omics data demand computational analysis but many analysis tools have computational resource requirements that increase with the number of cells analyzed. This presents scalability challenges as researchers use spatial omics technologies to profile millions of cells. Results: To enhance the scalability of spatial omics data analysis, we developed a rasterization preprocessing framework called SEraster that aggregates cellular information into spatial pixels. We apply SEraster to both real and simulated spatial omics data prior to spatial variable gene expression analysis to demonstrate that such preprocessing can reduce computational resource requirements while maintaining high performance, including as compared to other down-sampling approaches. We further integrate SEraster with existing analysis tools to characterize cell-type spatial co-enrichment across length scales. Finally, we apply SEraster to enable analysis of a mouse pup spatial omics dataset with over a million cells to identify tissue-level and cell-type-specific spatially variable genes as well as spatially co-enriched cell types that recapitulate expected organ structures. Availability and implementation: SEraster is implemented as an R package on GitHub (https://github.com/JEFworks-Lab/SEraster) with additional tutorials at https://JEF.works/SEraster.
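
The core idea in the abstract, aggregating cellular information into spatial pixels, can be illustrated with a small sketch. This is not SEraster's API (SEraster is an R package); the function name `rasterize`, its parameters, the square-pixel grid, and the choice of mean or sum aggregation are all assumptions made here for illustration, written in Python with NumPy only.

```python
import numpy as np


def rasterize(coords, expr, resolution, fun="mean"):
    """Hypothetical sketch: aggregate per-cell values into square pixels.

    coords     : (n_cells, 2) array of x/y positions
    expr       : (n_cells, n_genes) array of per-cell expression
    resolution : pixel side length, in the same units as coords
    fun        : "mean" or "sum" (assumed aggregation choices)

    Returns (pixel_ids, pixel_expr, cells_per_pixel), where pixel_ids
    holds the integer (x, y) grid index of each non-empty pixel.
    """
    coords = np.asarray(coords, dtype=float)
    expr = np.asarray(expr, dtype=float)

    # Integer grid index of each cell, measured from the minimum coordinate.
    idx = np.floor((coords - coords.min(axis=0)) / resolution).astype(int)

    # Collapse identical grid indices; `inverse` maps each cell to its pixel.
    pixel_ids, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = np.asarray(inverse).ravel()

    n_pixels = pixel_ids.shape[0]
    sums = np.zeros((n_pixels, expr.shape[1]))
    np.add.at(sums, inverse, expr)  # unbuffered add handles repeated indices
    counts = np.bincount(inverse, minlength=n_pixels)

    if fun == "sum":
        agg = sums
    elif fun == "mean":
        agg = sums / counts[:, None]
    else:
        raise ValueError("fun must be 'mean' or 'sum'")
    return pixel_ids, agg, counts


# Four cells, two genes; the first two cells share a pixel at resolution 10.
coords = [[0, 0], [1, 1], [12, 3], [25, 25]]
expr = [[1, 2], [3, 4], [5, 6], [7, 8]]
pixel_ids, pixel_expr, n_cells = rasterize(coords, expr, resolution=10)
```

Downstream tools then operate on the non-empty pixels (three here) instead of the individual cells (four here), which is where the reduction in computational resource requirements described in the abstract would come from. A coarser `resolution` gives fewer pixels at the cost of spatial detail.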