Spatial Transcriptomics Dimensionality Reduction using Wavelet Bases

Zhuoyan Xu, Kris Sankaran
DOI: https://doi.org/10.48550/arXiv.2205.11243
2022-05-19
Abstract: Spatially resolved transcriptomics (ST) measures gene expression along with the spatial coordinates of the measurements. The analysis of ST data involves significant computational complexity. In this work, we propose a gene expression dimensionality reduction algorithm that retains spatial structure. We combine the wavelet transformation with matrix factorization to select spatially varying genes and extract a low-dimensional representation of these genes. We consider an Empirical Bayes setting, imposing regularization through the prior distribution of factor genes. Additionally, we provide visualization of the extracted representation genes, capturing the global spatial pattern. We illustrate the performance of our methods by spatial structure recovery and gene expression reconstruction in simulation. In real data experiments, our method identifies the spatial structure of gene factors and outperforms regular decomposition in reconstruction error. We found a connection between the fluctuation of gene patterns and the wavelet technique, providing smoother visualization. We develop a package and share the workflow for generating reproducible quantitative results and gene visualization. The package is available at https://github.com/OliverXUZY/waveST.
Genomics, Machine Learning, Applications
What problem does this paper attempt to address?
This paper addresses how to reduce the dimensionality of spatial transcriptomics (ST) data while preserving its spatial structure. Specifically, the authors propose a gene expression dimensionality reduction algorithm based on wavelet transformation and matrix decomposition, which selects spatially variable genes and extracts low-dimensional representations of them.

### Problem Background

Spatial transcriptomics (ST) is a technology that measures gene expression while retaining the spatial coordinates of each measurement. Analyzing ST data is computationally demanding, especially when the gene expression data are high-dimensional. Analyzing these data effectively requires a dimensionality reduction method that preserves spatial information.

### Core Problems of the Paper

1. **Dimensionality Reduction while Preserving Spatial Structure**: Existing dimensionality reduction methods usually ignore the spatial information attached to gene expression, so they cannot accurately capture the spatial distribution patterns of genes. The paper therefore combines wavelet transformation with matrix decomposition to preserve spatial structure during dimensionality reduction.
2. **Improving Reconstruction Accuracy**: By introducing wavelet transformation and empirical Bayes matrix decomposition, the authors aim to recover gene expression patterns more accurately under low signal-to-noise ratio (SNR) conditions and to reduce the influence of noise on the results.
3. **Providing Smoother Visualization**: Traditional dimensionality reduction methods may produce discontinuous or outlying pixels, which degrades the visualization. Through wavelet transformation and threshold shrinkage, the proposed method generates smoother and more interpretable gene expression images.
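
To make the idea of threshold shrinkage in item 3 concrete, here is a minimal pure-Python sketch on a one-dimensional signal. It assumes an orthonormal Haar wavelet and soft thresholding; the summary does not say which wavelet family or shrinkage rule the paper uses, and the function names (`haar_forward`, `haar_inverse`, `soft_threshold`) and the threshold value are hypothetical, not part of the `waveST` package.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        coeffs[:n] = approx + detail  # scaled sums first, then scaled differences
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding the signal from coarse to fine."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / SQRT2)
            merged.append((a - d) / SQRT2)
        out[:2 * n] = merged
        n *= 2
    return out

def soft_threshold(coeffs, t):
    """Shrink every detail coefficient toward zero by t; keep the coarsest term."""
    shrunk = [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs[1:]]
    return [coeffs[0]] + shrunk

def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Assumed toy data: a step signal plus a small alternating perturbation.
truth = [1.0] * 4 + [5.0] * 4
noisy = [v + (0.1 if i % 2 == 0 else -0.1) for i, v in enumerate(truth)]
denoised = haar_inverse(soft_threshold(haar_forward(noisy), 0.2))

print(squared_error(noisy, truth), squared_error(denoised, truth))
```

In this toy case the fine-scale detail coefficients fall below the threshold and are set to zero, so the alternating perturbation disappears and the step is preserved, at the cost of a small shrinkage bias in the step height.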
### Method Overview

- **Wavelet Transformation**: Convert the spatial expression matrix of each gene into wavelet coefficients and remove noise through threshold shrinkage.
- **Matrix Decomposition**: Apply singular value decomposition (SVD) or empirical Bayes matrix factorization (EBMF) to the wavelet coefficient matrix to extract the low-dimensional representation.
- **Inverse Wavelet Transformation**: Map the low-dimensional representation back to the original spatial domain for reconstruction and visualization.

### Experimental Verification

The authors evaluate the method in simulation experiments and on real data. The results show that under low-SNR conditions, the wavelet-guided dimensionality reduction method estimates the underlying expression better than plain SVD and produces smoother gene expression images.

### Main Contributions

1. Proposes a gene expression dimensionality reduction method that combines wavelet transformation with matrix decomposition, reducing dimensionality while preserving spatial structure.
2. Shows that the wavelet-guided method achieves better estimation performance under low-SNR conditions.
3. Finds, through experiments, a connection between the wavelet technique and fluctuations in gene expression patterns, which helps in selecting spatially varying genes.
4. Provides an R package, `waveST`, with a workflow for reproducible quantitative results and gene visualization.

In summary, the paper targets dimensionality reduction for spatial transcriptomics data; its reported gains are in preserving spatial structure and in reconstruction accuracy.
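
The three steps listed under Method Overview can be sketched end to end in Python with NumPy. This is an illustration under stated assumptions, not the `waveST` implementation: it assumes each gene is measured on a square grid whose side is a power of two, uses a Haar basis and soft thresholding (the summary names neither), and uses truncated SVD rather than EBMF for the decomposition step. The names `haar_matrix` and `wavelet_lowrank` are hypothetical.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar analysis matrix for n a power of two."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        m = h.shape[0]
        top = np.kron(h, np.array([[1.0, 1.0]]))             # coarser averages
        bottom = np.kron(np.eye(m), np.array([[1.0, -1.0]]))  # finest differences
        h = np.vstack([top, bottom]) / np.sqrt(2.0)
    return h

def wavelet_lowrank(genes, rank, threshold):
    """genes: array of shape (n_genes, n, n). Returns loadings, factor maps, reconstruction."""
    n_genes, n, _ = genes.shape
    H = haar_matrix(n)

    # Step 1: 2-D wavelet transform of every gene image, then soft thresholding.
    # (Assumption: all coefficients are shrunk, including the coarsest one.)
    coeffs = H @ genes @ H.T
    shrunk = np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)

    # Step 2: truncated SVD of the genes-by-coefficients matrix.
    C = shrunk.reshape(n_genes, n * n)
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    loadings = U[:, :rank] * s[:rank]   # per-gene weights on each factor
    factors_w = Vt[:rank]               # factors in the wavelet domain
    low_rank = loadings @ factors_w

    # Step 3: inverse transform back to the spatial domain.
    recon = H.T @ low_rank.reshape(n_genes, n, n) @ H
    factor_maps = H.T @ factors_w.reshape(rank, n, n) @ H
    return loadings, factor_maps, recon

# Assumed toy data: six genes sharing one left/right spatial pattern plus noise.
rng = np.random.default_rng(0)
n = 8
pattern = np.zeros((n, n))
pattern[:, n // 2:] = 1.0
weights = rng.uniform(1.0, 3.0, size=6)
genes = weights[:, None, None] * pattern + 0.3 * rng.standard_normal((6, n, n))

loadings, factor_maps, recon = wavelet_lowrank(genes, rank=1, threshold=0.5)
print(loadings.shape, factor_maps.shape, recon.shape)  # (6, 1) (1, 8, 8) (6, 8, 8)
```

Because the Haar matrix is orthonormal, the inverse transform is just its transpose, and setting `threshold=0.0` with full rank reproduces the input exactly. Each entry of `factor_maps` can be plotted as an image to inspect the spatial pattern a factor captures.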