Lixin Lei, Kaitai Han, Zijun Wang, Chaojing Shi, Zhenghui Wang, Ruoyan Dai, Zhiwei Zhang, Mengqiu Wang, Qianjin Guo
Abstract: The latest breakthroughs in spatially resolved transcriptomics technology offer comprehensive opportunities to delve into gene expression patterns within the tissue microenvironment. However, the precise identification of spatial domains within tissues remains challenging. In this study, we introduce AttentionVGAE (AVGN), which integrates slice images, spatial information and raw gene expression while calibrating low-quality gene expression. By combining the variational graph autoencoder with multi-head attention blocks (MHA blocks), AVGN captures spatial relationships in tissue gene expression, adaptively focusing on key features and alleviating the need for prior knowledge of cluster numbers, thereby achieving superior clustering performance. Particularly, AVGN attempts to balance the model’s attention focus on local and global structures by utilizing MHA blocks, an aspect that current graph neural networks have not extensively addressed. Benchmark testing demonstrates its significant efficacy in elucidating tissue anatomy and interpreting tumor heterogeneity, indicating its potential in advancing spatial transcriptomics research and understanding complex biological phenomena.
What problem does this paper attempt to address?
### Problems the paper attempts to solve
The paper addresses the challenge of accurately identifying spatial domains within tissues from spatial transcriptomics data. Existing non-spatial methods are sensitive to technical fluctuations, which leads to inaccurate spatial domain identification. For example, the K-means and Louvain algorithms perform poorly in the presence of technical fluctuations, and methods based on conventional single-cell RNA sequencing also have deficiencies in achieving cellular or subcellular resolution. These methods usually take only gene expression data as input. Because cells must be dissociated during sample preparation, their positional information is lost, so the resulting clusters correspond only loosely to the actual structure of the tissue slice.
To overcome these problems, the authors introduce a new model, the Attention-Guided Variational Graph Auto-Encoder (AttentionVGAE, abbreviated AVGN). By combining multi-head attention blocks (MHA blocks) with a variational graph auto-encoder (VGAE), the model captures the spatial relationships of gene expression within tissues, adaptively focuses on key features, and does not need to know the number of clusters in advance. AVGN also attempts to balance the model's attention between local and global structures, thereby improving clustering performance.
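To make the attention component more concrete, here is a minimal sketch of generic multi-head scaled dot-product self-attention over spot embeddings. It is not the authors' MHA block: the head count, the dimensions, the random weights, the omission of a final output projection, and every function name (`attention_head`, `multi_head_attention`, `random_matrix`) are assumptions chosen for illustration, and a real model would use a tensor library rather than plain lists.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(x, wq, wk, wv):
    """One scaled dot-product self-attention head over n spots."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, v), weights


def multi_head_attention(x, heads):
    """Run every head and concatenate the per-spot outputs feature-wise."""
    outputs = [attention_head(x, *head)[0] for head in heads]
    return [sum((out[i] for out in outputs), []) for i in range(len(x))]


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights standing in for learned ones."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
x = random_matrix(5, 8, rng)  # 5 spots with 8 features each (toy data)
# Two heads, each with its own query, key, and value projection (8 -> 4).
heads = [tuple(random_matrix(8, 4, rng) for _ in range(3)) for _ in range(2)]
out = multi_head_attention(x, heads)
_, weights = attention_head(x, *heads[0])
print(len(out), len(out[0]))
```

Each head can weight the other spots differently, which is one way a model could attend to both nearby and distant structure; how AVGN actually balances the two is described in the paper, not in this sketch.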
### Specific objectives
1. **Integrate multiple data sources**: AVGN integrates slice images, spatial information, and raw gene expression data to calibrate low-quality gene expression.
2. **Capture spatial relationships**: By combining VGAE with multi-head attention, AVGN captures the spatial relationships of gene expression within tissues and adaptively focuses on key features.
3. **No prior knowledge required**: AVGN clusters without prior knowledge of the number of clusters, which makes the clustering step more flexible.
4. **Balance local and global structures**: AVGN uses multi-head attention to balance the attention paid to local and global structures, thereby improving clustering performance.
5. **Improve clustering accuracy**: Benchmark tests verify AVGN's clustering performance on multiple spatial transcriptomics datasets, especially its advantages in revealing tissue anatomical structures and explaining tumor heterogeneity.
### Method overview
- **Data pre-processing**: Use a pre-trained MaxViT model to extract features from slice images, and combine them with the raw gene expression data and spatial position information to construct an enhanced gene expression matrix.
- **Encoder**: Extract latent representations with a graph convolutional network (GCN) encoder and multi-head attention.
- **Reparameterization**: Generate the latent variables Z used in the subsequent decoding process.
- **Decoder**: Reconstruct features through the decoder to generate an adjacency matrix.
- **Performance evaluation**: Use the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI) to evaluate clustering performance.
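The encoder, reparameterization, and decoder steps above follow the general shape of a variational graph auto-encoder, which can be sketched as below. This is a generic VGAE forward pass, not the AVGN implementation: the two-layer GCN, the inner-product decoder, the toy graph, the random weights, and all function names are assumptions, and the attention blocks and image features are left out.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Symmetric GCN normalization: D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(a_norm, x, w, relu=False):
    """One graph convolution: A_norm @ X @ W, optionally followed by ReLU."""
    h = matmul(matmul(a_norm, x), w)
    return [[max(0.0, v) for v in row] for row in h] if relu else h


def reparameterize(mu, logvar, rng):
    """z = mu + sigma * eps, with eps ~ N(0, 1) and sigma = exp(0.5 * logvar)."""
    return [[m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(m_row, lv_row)]
            for m_row, lv_row in zip(mu, logvar)]


def decode(z):
    """Inner-product decoder (assumed): edge probability = sigmoid(z_i . z_j)."""
    z_t = [list(col) for col in zip(*z)]
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in matmul(z, z_t)]


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights standing in for learned ones."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
# Toy spatial graph: 4 spots on a line, 0-1-2-3.
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
x = random_matrix(4, 6, rng)  # 4 spots with 6 (enhanced) expression features

a_norm = normalize_adjacency(adj)
h = gcn_layer(a_norm, x, random_matrix(6, 5, rng), relu=True)  # shared hidden layer
mu = gcn_layer(a_norm, h, random_matrix(5, 2, rng))            # latent means
logvar = gcn_layer(a_norm, h, random_matrix(5, 2, rng))        # latent log-variances
z = reparameterize(mu, logvar, rng)                            # latent variables Z
adj_rec = decode(z)                                            # reconstructed adjacency
```

Sampling through `mu + sigma * eps` keeps the randomness in `eps`, so gradients can flow to the encoder outputs during training. The training loss (reconstruction plus KL divergence) is not shown here.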
### Results
- **DLPFC dataset**: On the DLPFC dataset, AVGN achieves higher ARI values than the methods it is compared against, and the UMAP visualization shows tighter clusters with clearer boundaries.
- **Coronal slices of adult mouse brains**: When locating fine structures in coronal slices of adult mouse brains, AVGN produces clear, smooth domain boundaries that correlate strongly with the expression patterns of highly variable genes.
- **Adult glioblastoma**: AVGN achieves the highest average ARI value on the adult glioblastoma dataset, and the UMAP visualization shows better-separated clusters than those of other methods.
Through these results, AVGN shows great potential in spatial transcriptomics research, especially in revealing tissue anatomical structures and explaining tumor heterogeneity.
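For readers who want to reproduce this kind of comparison, the two evaluation metrics named in the method overview can be computed from a pair of label vectors as sketched below. This is a plain-Python illustration using the standard definitions, not the authors' evaluation script. The function names and toy labels are made up, NMI is normalized by the arithmetic mean of the two entropies (other normalizations exist), and at least two samples are assumed.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_true, labels_pred):
    """ARI from the contingency table; 1.0 means identical partitions."""
    n = len(labels_true)  # assumes n >= 2
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)  # index expected by chance
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:  # degenerate case, e.g. one cluster on both sides
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def entropy(labels):
    """Shannon entropy (natural log) of a label vector."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_info(labels_true, labels_pred):
    """Mutual information divided by the mean of the two entropies."""
    n = len(labels_true)
    count_t, count_p = Counter(labels_true), Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))
    mi = sum((c / n) * log(c * n / (count_t[t] * count_p[p])) for (t, p), c in joint.items())
    denom = 0.5 * (entropy(labels_true) + entropy(labels_pred))
    return mi / denom if denom > 0 else 1.0


# Toy example: the same partition with permuted cluster ids scores 1.0 on both.
truth = [0, 0, 1, 1, 2, 2]
permuted = [1, 1, 0, 0, 2, 2]
ari_same = adjusted_rand_index(truth, permuted)
nmi_same = normalized_mutual_info(truth, permuted)

# Splitting one true cluster lowers the score: ARI is 4/7, about 0.571.
ari_split = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
print(ari_same, nmi_same, ari_split)
```

Both metrics ignore the cluster ids themselves and compare only how samples are grouped, which is why they suit unsupervised domain detection against manual annotations.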