STGIC: A graph and image convolution-based method for spatial transcriptomic clustering

Chen Zhang, Junhui Gao, Hong-Yu Chen, Lingxin Kong, Guangshuo Cao, Xiangyu Guo, Wei Liu, Bin Ren, Dong-Qing Wei
DOI: https://doi.org/10.1371/journal.pcbi.1011935
2024-02-29
PLoS Computational Biology
Abstract: Spatial transcriptomic (ST) clustering uses spatial and transcriptional information to group spots that are spatially coherent and transcriptionally similar into the same spatial domain. Graph convolution networks (GCN) and graph attention networks (GAT), fed with an adjacency matrix derived from spatial coordinates and a feature matrix derived from transcription profiles, are often used to solve this problem. Our proposed method, STGIC (spatial transcriptomic clustering with graph and image convolution), is designed for techniques with regular lattices on chips. It uses adaptive graph convolution (AGC) to obtain high-quality pseudo-labels and then applies a dilated convolution framework (DCF) to a virtual image converted from the gene expression information and spatial coordinates of the spots. The dilation rates and kernel sizes are set appropriately, and the updating of weight values in the kernels is made subject to the spatial distance from the position of the corresponding elements to the kernel centers, so that feature extraction for each spot is better guided by its spatial distance to neighboring spots. The training objective of DCF consists of self-supervision realized by Kullback–Leibler (KL) divergence, a spatial continuity loss, and a cross-entropy loss calculated among spots with high-confidence pseudo-labels. STGIC attains state-of-the-art (SOTA) clustering performance on the benchmark dataset of 10x Visium human dorsolateral prefrontal cortex (DLPFC). It is also capable of depicting fine structures of other tissues from other species and of guiding the identification of marker genes. In addition, STGIC is extendable to Stereo-seq data with high spatial resolution.

Spatial transcriptomics detects gene transcription profiles at high spatial resolution, even down to the sub-cellular level, which is very helpful for characterizing the organization and architecture of tissues and for finding marker genes corresponding to sub-structures. Therefore, it is much more informative than traditional transcriptomics for interpreting the biological processes underlying tissue development and disease progression. Clustering, which delineates the spatial domains of a tissue in which cells present similar transcription profiles and histology, is a primary task in analyzing spatial transcriptomics data. To solve this important problem, we propose STGIC, a method that combines deep learning techniques for both graphs and images, since we notice that image methods tend to perform well on some samples in benchmark datasets on which existing graph methods perform poorly, and vice versa. In addition, we replace the vanilla form of graph convolution frequently used in this field with one that adaptively finds the optimal order of neighbors involved in feature aggregation for different graphs. STGIC displays high clustering performance and identifies spatial domains that can be used to depict fine-grained structure and to help ascertain marker genes of different tissues from different species sequenced with the 10x Visium and Stereo-seq techniques.
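The "virtual image" idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only assumes that every spot has integer (row, col) lattice indices and a feature vector (for example, reduced gene expression), and the function name `spots_to_virtual_image` and its arguments are hypothetical. It places each spot's features at its lattice position in a dense array that an image convolution could then consume, and returns a mask marking which pixels hold real spots.

```python
import numpy as np


def spots_to_virtual_image(coords, features):
    """Place per-spot feature vectors on a dense grid (illustrative only).

    coords:   (n_spots, 2) integer array of (row, col) lattice positions.
    features: (n_spots, n_features) array, e.g. reduced gene expression.
    Returns (image, mask): image has shape (n_features, H, W); mask has
    shape (H, W) and is True where a spot exists.
    """
    coords = np.asarray(coords, dtype=int)
    features = np.asarray(features, dtype=float)

    # Shift indices so the smallest row and column become 0.
    shifted = coords - coords.min(axis=0)
    height, width = shifted.max(axis=0) + 1

    # Pixels without a spot stay zero; the mask records real spots.
    image = np.zeros((features.shape[1], height, width))
    mask = np.zeros((height, width), dtype=bool)

    rows, cols = shifted[:, 0], shifted[:, 1]
    image[:, rows, cols] = features.T
    mask[rows, cols] = True
    return image, mask


# Three spots on a 2 x 2 patch of the lattice, two features per spot.
example_coords = np.array([[2, 3], [2, 4], [3, 3]])
example_features = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
example_image, example_mask = spots_to_virtual_image(example_coords, example_features)
```

Keeping the mask alongside the image is a design choice of this sketch: it lets a downstream loss ignore empty pixels, which matters when the tissue does not fill the whole chip.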
biochemical research methods, mathematical & computational biology