Abstract: Spatial clustering of transcriptomics data is a key task: it classifies tissue samples into diverse subpopulations of cells, which in turn facilitates the analysis of the biological functions of clusters, tissue reconstruction, and cell-cell interactions. Many approaches leverage gene expression, spatial locations, and histological images to detect spatial domains; however, Graph Neural Networks (GNNs), the state-of-the-art models, are limited by their assumption of pairwise connections between nodes. In spatial transcriptomics domain detection, some cells are not directly related yet belong to the same domain, which shows that GNNs cannot capture implicit connections among cells. Whereas a graph edge connects only two nodes, a hyperedge connects an arbitrary number of nodes, which lets Hypergraph Neural Networks (HGNNs) capture and utilize richer and more complex structural information than traditional GNNs. Because ground-truth labels are unavailable, we use autoencoders, which are well suited to unsupervised learning. Our model achieves the highest iLISI score, 1.843, among the compared methods, indicating the greatest diversity of cell types identified. It also outperforms other methods in downstream clustering, achieving the highest ARI of 0.51 and Leiden score of 0.60.

What problem does this paper attempt to address?

The problem this paper addresses is how to reveal spatial domains from single-cell transcriptome data: that is, how to analyze spatial transcriptome data to classify tissue samples into distinct cell subpopulations, and so better understand the spatial relationships between cells, their biological functions, and cell-cell interactions.

### Background and Challenges of the Problem

1. **Importance of Spatial Transcriptome Data**
   Spatial transcriptome data can help identify new cell types and provide a better understanding of biological processes in the tissue microenvironment. Such data can be obtained through imaging techniques and sequencing techniques. Each technique has its own advantages and disadvantages, but all generate multi-modal, multi-scale, and high-resolution data.
2. **Limitations of Existing Methods**
   - Many existing methods rely on gene expression, spatial location, and histological images to detect spatial domains, but they fall short in capturing the internal relationships or spatial dependencies of two-dimensional features.
   - Graph Neural Networks (GNNs), the state-of-the-art models, assume pairwise connections between nodes and cannot capture some of the implicit, complex relationships between cells. For example, in spatial transcriptomics some cells belong to the same domain even though they are not directly related, which exposes the limits of GNNs in capturing implicit connections between cells.
3. **Research Objectives**
   - This paper aims to address these problems by introducing Hypergraph Neural Networks (HGNNs). HGNNs can capture higher-order relationships because each hyperedge can connect any number of nodes, thus encoding richer structural information.
   - Because actual labels are difficult to obtain, the authors use autoencoders for unsupervised learning to detect patterns and anomalies.
### Overview of the Method

- **Hypergraph Construction**: Use the K-Nearest Neighbor algorithm to find each cell's nearest cells and group them into hyperedges, forming a hypergraph structure.
- **Autoencoder**: Generates a latent representation of gene expression so that unlabeled data can be handled.
- **Hypergraph Neural Network**: Learns node representations through a two-step message-passing process (vertex to hyperedge to vertex) to capture the complex relationships between cells.
- **Optimization and Evaluation**: The model is optimized by minimizing the error between the reconstructed similarity matrix and the original adjacency matrix, and its performance is evaluated with metrics such as iLISI and ARI.

### Experimental Results

The model performs well on multiple evaluation metrics. For example, it obtains the highest iLISI score (1.843), indicating that it identifies the most diverse cell types. It also performs well in downstream clustering tasks, obtaining the highest ARI value (0.51) and Leiden score (0.60).

In conclusion, this paper addresses the shortcomings of existing methods in capturing complex relationships between cells by introducing hypergraph neural networks, providing new ideas and technical means for revealing spatial domains from single-cell transcriptome data.
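The hypergraph construction and the two-step message passing described in the method overview can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes one hyperedge per cell (the cell plus its `k` nearest spatial neighbours), unit hyperedge weights, and the standard HGNN propagation rule `ReLU(Dv^-1/2 H De^-1 H^T Dv^-1/2 X Theta)`. The function names, the toy coordinates, and the random features are hypothetical, and the autoencoder and the reconstruction loss are omitted.

```python
import numpy as np

def knn_hypergraph(coords, k):
    """Incidence matrix H (cells x hyperedges); hyperedge e holds cell e
    and its k nearest neighbours (assumed construction, one edge per cell)."""
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, -1.0)  # guarantees each cell is ranked first for itself
    nearest = np.argsort(dist, axis=1)[:, :k + 1]
    H = np.zeros((n, n))
    for e in range(n):
        H[nearest[e], e] = 1.0
    return H

def hgnn_layer(X, H, Theta):
    """One hypergraph convolution with unit hyperedge weights."""
    Dv = H.sum(axis=1)  # vertex degree: number of hyperedges containing each cell
    De = H.sum(axis=0)  # hyperedge degree: number of cells in each hyperedge
    scale = 1.0 / np.sqrt(Dv)
    # Step 1 (vertex -> hyperedge): average the scaled features of member cells.
    edge_feat = (H.T @ (scale[:, None] * X)) / De[:, None]
    # Step 2 (hyperedge -> vertex): sum features of incident hyperedges, rescale.
    node_feat = scale[:, None] * (H @ edge_feat)
    return np.maximum(node_feat @ Theta, 0.0)  # linear map followed by ReLU

# Toy usage: 6 cells with 2-D positions and 4-dimensional latent features.
rng = np.random.default_rng(0)
k = 2
coords = rng.random((6, 2))
X = rng.random((6, 4))
Theta = rng.standard_normal((4, 3))
H = knn_hypergraph(coords, k)
Z = hgnn_layer(X, H, Theta)
print(H.sum(axis=0))  # every hyperedge contains k + 1 cells
print(Z.shape)
```

Because each column of `H` can hold more than two nonzero entries, a single hyperedge ties together a whole neighbourhood of cells, which is the higher-order relationship that a pairwise adjacency matrix cannot express directly.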

Hypergraph Neural Networks Reveal Spatial Domains from Single-cell Transcriptomics Data

A comprehensive overview of graph neural network-based approaches to clustering for spatial transcriptomics

Cell Clustering for Spatial Transcriptomics Data with Graph Neural Networks

STGNNks: Identifying cell types in spatial transcriptomics data based on graph neural network, denoising auto-encoder, and k-sums clustering

HyperGCN: an effective deep representation learning framework for the integrative analysis of spatial transcriptomics data

Cross-view graph neural networks for spatial domain identification by integrating gene expression, spatial locations with histological images

Graph deep learning enabled spatial domains identification for spatial transcriptomics

Graph Neural Operators for Classification of Spatial Transcriptomics Data

Spatial-MGCN: a novel multi-view graph convolutional network for identifying spatial domains with attention mechanism

Assembling spatial clustering framework for heterogeneous spatial transcriptomics data with GRAPHDeep

STdGCN: spatial transcriptomic cell-type deconvolution using graph convolutional networks

Unraveling spatial domain characterization in spatially resolved transcriptomics with robust graph contrastive clustering

Reinventing gene expression connectivity through regulatory and spatial structural empowerment via principal node aggregation graph neural network

DGSIST: Clustering spatial transcriptome data based on deep graph structure Infomax

Deciphering tissue heterogeneity from spatially resolved transcriptomics by the autoencoder-assisted graph convolutional neural network

Spatial Domain Identifying: Graph Attention Network with Two Different Decoders

Gene expression prediction from histology images via hypergraph neural networks

Identifying spatial domains of spatially resolved transcriptomics via multi-view graph convolutional networks

MNMST: topology of cell networks leverages identification of spatial domains from spatial transcriptomics data

CCST: Cell clustering for spatial transcriptomics data with graph neural network

HGNN$^+$: General Hypergraph Neural Networks