Multi-modal domain adaptation for revealing spatial functional landscape from spatially resolved transcriptomics

Lequn Wang, Yaofeng Hu, Kai Xiao, Chuanchao Zhang, Qianqian Shi, Luonan Chen
DOI: https://doi.org/10.1093/bib/bbae257
IF: 9.5
2024-06-02
Briefings in Bioinformatics
Abstract: Spatially resolved transcriptomics (SRT) has emerged as a powerful tool for investigating gene expression in spatial contexts, providing insights into the molecular mechanisms underlying organ development and disease pathology. However, the expression sparsity poses a computational challenge to integrate other modalities (e.g. histological images and spatial locations) that are simultaneously captured in SRT datasets for spatial clustering and variation analyses. In this study, to meet such a challenge, we propose multi-modal domain adaption for spatial transcriptomics (stMDA), a novel multi-modal unsupervised domain adaptation method, which integrates gene expression and other modalities to reveal the spatial functional landscape. Specifically, stMDA first learns the modality-specific representations from spatial multi-modal data using multiple neural network architectures and then aligns the spatial distributions across modal representations to integrate these multi-modal representations, thus facilitating the integration of global and spatially local information and improving the consistency of clustering assignments. Our results demonstrate that stMDA outperforms existing methods in identifying spatial domains across diverse platforms and species. Furthermore, stMDA excels in identifying spatially variable genes with high prognostic potential in cancer tissues. In conclusion, stMDA as a new tool of multi-modal data integration provides a powerful and flexible framework for analyzing SRT datasets, thereby advancing our understanding of intricate biological systems.
biochemical research methods, mathematical & computational biology
### What problem does this paper attempt to solve?

This paper addresses the challenge that sparse, noisy expression data in Spatially Resolved Transcriptomics (SRT) poses for multimodal data integration. SRT technology captures the spatial information of gene expression, which is crucial for studying the molecular mechanisms of organ development and disease pathology. However, due to technical limitations, SRT data is usually sparse and noisy, which makes spatial clustering and variation analysis computationally difficult.

To solve this problem, the authors propose a multimodal unsupervised domain adaptation method, **stMDA (multi-modal domain adaptation for spatial transcriptomics)**, which integrates multiple modalities (gene expression, histological images, and spatial locations) to reveal spatial functional regions. With this approach, stMDA can correct low-quality gene expression data and improve the accuracy of spatial domain detection on datasets from multiple platforms and species.

### Main contributions of stMDA

1. **Multimodal representation learning**: stMDA uses multiple neural network architectures to learn modality-specific representations from spatial multimodal data.
2. **Deep spatial distribution alignment**: Aligning the global and local spatial distributions of the different modality representations improves the consistency of the clustering results.
3. **Joint representation and gene expression reconstruction**: Using an attention mechanism and decoder components, stMDA integrates the multimodal representations into a joint representation and reconstructs the gene expression matrix.
4. **Performance**: stMDA outperforms existing methods in identifying spatial domains across diverse platforms and species, and it identifies spatially variable genes with high prognostic potential in cancer tissues.
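The attention-based integration in the third contribution can be illustrated with a generic sketch. The summary does not give the paper's exact attention form, so everything below is an assumption: the function name `attention_fuse`, the scoring vector `w` (which would be learned in practice), and the `tanh` scoring are hypothetical choices that only show how per-point attention weights can combine several modality representations into one joint representation.

```python
import numpy as np

def attention_fuse(reps, w):
    """Fuse M modality representations with per-point attention weights.

    reps : list of M arrays, each of shape (N, d), one row per point.
    w    : (d,) scoring vector (hypothetical; learned in a real model).
    Returns the joint representation (N, d) and the weights (N, M).
    """
    Z = np.stack(reps, axis=1)                    # (N, M, d)
    scores = np.tanh(Z) @ w                       # (N, M) one score per modality
    scores -= scores.max(axis=1, keepdims=True)   # stabilise the softmax
    alpha = np.exp(scores)
    alpha /= alpha.sum(axis=1, keepdims=True)     # softmax over modalities
    joint = (alpha[:, :, None] * Z).sum(axis=1)   # weighted sum -> (N, d)
    return joint, alpha

# Toy usage with random stand-ins for the three modality representations.
rng = np.random.default_rng(0)
H1, H2, H3 = (rng.normal(size=(5, 4)) for _ in range(3))
joint, alpha = attention_fuse([H1, H2, H3], rng.normal(size=4))
print(joint.shape, alpha.sum(axis=1))  # each row of alpha sums to 1
```

A decoder would then map `joint` back to the gene expression matrix; that step is omitted here because the summary does not describe its architecture.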
### Formula presentation

The key formulas in the paper are as follows.

- **Similarity matrix calculation**:

  \[ A_{ij}=\frac{D_{ij}}{\sum_{i = 0}^{N}D_{ij}}, \quad D_{ij}=\exp\left(2-\frac{\langle U_i, U_j\rangle}{\|U_i\|\|U_j\|}\right) \]

  where the matrix \(U\in\mathbb{R}^{15\times N}\) is a low-dimensional matrix composed of 15 principal components, and \(U_i\) holds the first 15 principal components of the \(i\)-th point.

- **GCN-layer representation learning**:

  \[ H_1^{(l)}=\varphi_l\left(W^{(l)}H_1^{(l - 1)}\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}+b^{(l)}\right) \]

  where \(\varphi_l\) is the activation function of the \(l\)-th GCN layer, \(W^{(l)}\) is the weight matrix, \(b^{(l)}\) is the bias term, \(\tilde{A}=A + I\) is the adjacency matrix with self-loops added, and \(\tilde{D}\) is its diagonal degree matrix with \(\tilde{D}_{ii}=\sum_j\tilde{A}_{ij}\). The product \(\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}\) is the symmetrically normalized adjacency matrix.

- **MMD loss function**:

  \[ \text{Loss}_{\text{align D}}=\sum_{l = 1}^K\left[\text{MMD}(H_2^{(l)}, H_1^{(l)})+\text{MMD}(H_2^{(l)}, H_3^{(l)})\right] \]

  \[ \text{MMD}(X, Y)=\left\|\frac{1}{n}\sum_{i = 1}^n\varphi(x_i)-\frac{1}{n}\sum_{j = 1}^n\varphi(y_j)\right\| \]

  where \(H_1^{(l)}\), \(H_2^{(l)}\), and \(H_3^{(l)}\) are the modality-specific representations at layer \(l\), and the sum runs over \(K\) layers. Here \(\varphi\) denotes the feature map used by the maximum mean discrepancy (MMD), not the layer activation \(\varphi_l\).
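The three formulas above can be sketched in NumPy to make the shapes and normalizations concrete. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `tanh` activation, and the Gaussian kernel with bandwidth `sigma` (used to evaluate the feature-map norm in MMD through the kernel trick) are choices made here, and the MMD is the simple biased estimator. The similarity matrix is implemented exactly as written above, with the sum taken over all points.

```python
import numpy as np

def similarity_matrix(U):
    """A_ij = D_ij / sum_i D_ij with D_ij = exp(2 - cos(U_i, U_j)).

    U : (15, N) principal components, one column per point.
    Assumes no column of U is all zeros.
    """
    norms = np.linalg.norm(U, axis=0, keepdims=True)   # (1, N)
    cos = (U.T @ U) / (norms.T @ norms)                # (N, N) cosine similarity
    D = np.exp(2.0 - cos)
    return D / D.sum(axis=0, keepdims=True)            # each column sums to 1

def gcn_layer(H, A, W, b, activation=np.tanh):
    """One GCN layer: phi(W H D~^{-1/2} A~ D~^{-1/2} + b).

    H : (d_in, N) features, A : (N, N), W : (d_out, d_in), b : (d_out, 1).
    The tanh activation is an assumption.
    """
    A_tilde = A + np.eye(A.shape[0])                   # add self-loops
    d_inv_sqrt = np.diag(A_tilde.sum(axis=1) ** -0.5)  # D~_ii = sum_j A~_ij
    return activation(W @ H @ d_inv_sqrt @ A_tilde @ d_inv_sqrt + b)

def mmd(X, Y, sigma=1.0):
    """Biased MMD estimate with a Gaussian kernel (kernel choice is an assumption).

    X, Y : (n, d) arrays with one sample per row.
    """
    def kernel(P, Q):
        sq = (P ** 2).sum(1)[:, None] + (Q ** 2).sum(1)[None, :] - 2.0 * P @ Q.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))
    mmd2 = kernel(X, X).mean() + kernel(Y, Y).mean() - 2.0 * kernel(X, Y).mean()
    return np.sqrt(max(mmd2, 0.0))

def alignment_loss(H1_layers, H2_layers, H3_layers, sigma=1.0):
    """Sum over layers of MMD(H2, H1) + MMD(H2, H3); each H is (d, N)."""
    return sum(
        mmd(H2.T, H1.T, sigma) + mmd(H2.T, H3.T, sigma)
        for H1, H2, H3 in zip(H1_layers, H2_layers, H3_layers)
    )

# Toy usage with random data standing in for real SRT inputs.
rng = np.random.default_rng(0)
U = rng.normal(size=(15, 30))
A = similarity_matrix(U)
H1 = gcn_layer(U, A, rng.normal(size=(8, 15)), np.zeros((8, 1)))
H2, H3 = rng.normal(size=(8, 30)), rng.normal(size=(8, 30))
print(A.shape, H1.shape, alignment_loss([H1], [H2], [H3]))
```

Because MMD compares sets of samples, the sketch transposes each representation so that points are rows; the loss is zero when the three representations coincide and grows as their distributions drift apart.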