Bin2cell reconstructs cells from high resolution Visium HD data

Krzysztof Polanski, Raquel Bartolome-Casado, Ioannis Sarropoulos, Chuan Xu, Nick England, Frode L Jahnsen, Sarah A Teichmann, Nadav Yayon
DOI: https://doi.org/10.1101/2024.06.19.599766
2024-06-22
Abstract:

Summary: Visium HD by 10X Genomics is the first commercially available platform capable of capturing full scale transcriptomic data paired with a reference morphology image from archived FFPE blocks at sub-cellular resolution. However, aggregation of capture regions to single cells poses challenges. Bin2cell reconstructs cells from the highest resolution data (2 μm bins) by leveraging morphology image segmentation and gene expression information. It is compatible with established Python single cell and spatial transcriptomics software, and operates efficiently in a matter of minutes without requiring a GPU. We demonstrate improvements in downstream analysis when using the reconstructed cells over default 8 μm bins on mouse brain and human colorectal cancer data.

Availability and Implementation: Bin2cell is available at https://github.com/Teichlab/bin2cell, along with documentation and usage examples, and can be installed from pip. Probe design functionality is available at https://github.com/Teichlab/gene2probe

Supplementary information: Supplementary data are available online.
Bioinformatics
What problem does this paper attempt to address?
The paper asks how to reconstruct single cells from high-resolution Visium HD data so that spatial transcriptomics analysis becomes more accurate and efficient. Visium HD captures whole-transcriptome data from archived FFPE (formalin-fixed paraffin-embedded) tissue blocks, paired with a reference morphology image, at sub-cellular resolution. Aggregating the capture regions (bins) into single cells, however, is challenging. To address this, the authors developed **Bin2cell**, which combines morphology image segmentation with gene expression information to reconstruct cells from the highest-resolution data (2 μm bins). On mouse brain and human colorectal cancer data, the reconstructed cells improve downstream analysis compared with the default 8 μm bins, particularly in how accurately cell locations and distributions are recovered.

### Main contributions:

1. **Improved resolution**: Bin2cell identifies single cells more accurately by starting from the 2 μm bins and combining them with morphology image segmentation.
2. **Compatibility with existing tools**: Bin2cell works with established Python single-cell and spatial transcriptomics software (such as SCANPY), so it slots into existing analysis workflows.
3. **Efficient operation**: Bin2cell completes its analysis within minutes and does not require a GPU, which lowers the hardware requirements.
4. **Improved downstream analysis**: The reconstructed cells perform better than the default 8 μm bins in downstream analysis of mouse brain and human colorectal cancer data, especially in the accuracy of cell location and distribution.
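The aggregation step at the heart of the summary above (turning labelled 2 μm bins into per-cell expression profiles) can be illustrated with a minimal NumPy sketch. This is not Bin2cell's own code: the function name `sum_bins_by_label`, the toy count matrix, and the convention that label 0 means "unassigned" are all assumptions made for illustration.

```python
import numpy as np

def sum_bins_by_label(counts, labels):
    """Sum per-bin gene counts into per-cell counts.

    counts: (n_bins, n_genes) array of counts per 2 um bin.
    labels: (n_bins,) integer cell label per bin; 0 means unassigned
            (an assumption of this sketch).
    Returns (cell_ids, cell_counts), with one row of cell_counts per cell.
    """
    assigned = labels > 0
    # Map each assigned bin to the row index of its cell
    cell_ids, inverse = np.unique(labels[assigned], return_inverse=True)
    cell_counts = np.zeros((cell_ids.size, counts.shape[1]), dtype=counts.dtype)
    # Unbuffered add, so bins sharing a label accumulate into the same row
    np.add.at(cell_counts, inverse, counts[assigned])
    return cell_ids, cell_counts

# Toy data: 5 bins x 2 genes; bin 3 carries no cell label and is dropped
counts = np.array([[1, 0], [2, 1], [0, 3], [4, 0], [1, 1]])
labels = np.array([1, 1, 2, 0, 2])
cell_ids, cell_counts = sum_bins_by_label(counts, labels)
print(cell_ids)     # [1 2]
print(cell_counts)  # [[3 1]
                    #  [1 4]]
```

Grouping by a segmentation label, rather than by a fixed 8 μm grid, is what lets each resulting profile follow the shape of one cell instead of a square that may straddle several.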
### Technical details:

- **Morphological image segmentation**: The StarDist algorithm identifies nuclei in the H&E image, and the resulting labels are expanded into adjacent unlabeled bins.
- **Gene expression information**: A secondary segmentation based on gene expression data recovers cells whose nuclei could not be detected in the H&E image.
- **Variable bin-size correction**: The 2 μm bins vary in size for technical reasons; Bin2cell corrects for this and reduces the resulting "streaking" effect.
- **Custom probe design**: An additional workflow creates custom probes to supplement the default panel.

### Experimental verification:

- **Mouse brain data**: Bin2cell recovers cell locations and distributions more accurately, especially in the hippocampus.
- **Human colorectal cancer data**: Bin2cell improves the confidence of cell-type prediction and better reconstructs fine morphological structures such as venous and arterial layers.

In conclusion, the paper addresses a key problem in the analysis of high-resolution spatial transcriptomics data by developing Bin2cell, which gives biological research a more accurate and efficient way to work at the level of single cells.
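The first two technical details (expanding nuclear labels into adjacent unlabeled bins, then adding cells found only by the gene-expression segmentation) can be sketched on a toy grid. This is a simplified illustration, not Bin2cell's implementation: the helper names `expand_labels_toy` and `add_secondary_objects`, the Euclidean distance cut-off, and the rule that a secondary object is kept only if it touches no primary label are all assumptions of this sketch.

```python
import numpy as np

def expand_labels_toy(labels, max_dist):
    """Give each unlabeled bin (0) the label of its nearest labeled bin,
    provided that bin lies within max_dist bins (Euclidean distance)."""
    expanded = labels.copy()
    lab_rows, lab_cols = np.nonzero(labels)
    if lab_rows.size == 0:
        return expanded
    for r, c in zip(*np.nonzero(labels == 0)):
        # Squared distance from this bin to every originally labeled bin
        d2 = (lab_rows - r) ** 2 + (lab_cols - c) ** 2
        nearest = np.argmin(d2)
        if d2[nearest] <= max_dist ** 2:
            expanded[r, c] = labels[lab_rows[nearest], lab_cols[nearest]]
    return expanded

def add_secondary_objects(primary, secondary):
    """Keep all primary labels; add a secondary object only if none of its
    bins already carries a primary label (assumed rule for this sketch)."""
    combined = primary.copy()
    next_id = primary.max() + 1
    for sec_id in np.unique(secondary[secondary > 0]):
        mask = secondary == sec_id
        if not np.any(primary[mask] > 0):
            combined[mask] = next_id  # fresh ID avoids clashing with primary
            next_id += 1
    return combined

# Two single-bin "nuclei" on a 5 x 5 grid of bins
nuclei = np.zeros((5, 5), dtype=int)
nuclei[1, 1] = 1
nuclei[3, 3] = 2
expanded = expand_labels_toy(nuclei, max_dist=1)

# Secondary (expression-based) labels: object 1 sits in empty space,
# object 2 overlaps nucleus 1 and is therefore discarded
gex = np.zeros((5, 5), dtype=int)
gex[0, 3:5] = 1
gex[1, 1:3] = 2
combined = add_secondary_objects(expanded, gex)
print(combined)
```

Letting the image-derived labels take precedence reflects the ordering in the summary, where the gene-expression segmentation is a secondary pass for cells the H&E segmentation missed.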