Cell Simulation as Cell Segmentation

Daniel C. Jones, Anna E. Elz, Azadeh Hadadianpour, Heeju Ryu, David R. Glass, Evan W. Newell
DOI: https://doi.org/10.1101/2024.04.25.591218
2024-07-09
Abstract: Single-cell spatial transcriptomics promises a highly detailed view of a cell's transcriptional state and microenvironment, yet inaccurate cell segmentation can render this data murky by misattributing large numbers of transcripts to nearby cells or conjuring nonexistent cells. We adopt methods from ab initio cell simulation to rapidly infer morphologically plausible cell boundaries that preserve cell type heterogeneity. Benchmarking on datasets generated by three commercial platforms shows superior performance and computational efficiency of this approach compared with existing methods. We show that improved accuracy in cell segmentation aids greatly in the detection of tumor-infiltrating immune cells that are difficult to segment accurately, such as neutrophils and T cells. Lastly, through improvements in our ability to delineate subsets of tumor-infiltrating T cells, we show that CXCL13-expressing CD8+ T cells tend to be more closely associated with tumor cells than their CXCL13-negative counterparts in data generated from renal cell carcinoma patient samples.
Bioinformatics
What problem does this paper attempt to address?
The paper primarily addresses a key issue in single-cell spatial transcriptomics: biased data interpretation caused by inaccurate cell segmentation. The authors propose a new method called Proseg (Probabilistic Segmentation), an unsupervised approach based on a probabilistic model that improves the accuracy of cell boundary delineation. Specifically, the paper tackles the following core issues:

1. **Challenges in Cell Segmentation**: Current cell segmentation techniques, especially image-based methods, have inherent problems such as poor generalization across samples and experimental conditions, loss of 3D information, and difficulty in determining cell boundaries. These issues cause transcripts (RNA molecules) to be misassigned to neighboring cells, which reduces the accuracy of downstream analysis.
2. **Proseg Method**: To overcome these limitations, the authors developed Proseg. Inspired by Cellular Potts Models (CPMs), the method uses a probabilistic model to simulate cell morphology and optimizes these simulations to better explain the observed transcript distribution. This approach infers cell boundaries more accurately and also repositions transcripts that are implausibly located or misplaced due to technical errors.
3. **Performance Evaluation**: By benchmarking on datasets from three commercial platforms (Vizgen MERSCOPE, NanoString CosMx, and 10X Xenium), the authors demonstrate Proseg's advantages in reducing transcript misassignment, improving cell type identification accuracy, and computational efficiency. Notably, in lung cancer samples, Proseg more accurately distinguishes hard-to-segment immune cell types, such as neutrophils and T cells.
4. **Clinical Application Example**: Finally, by analyzing samples from renal cell carcinoma patients, the authors show that the improved segmentation from Proseg reveals a closer spatial association between CXCL13-expressing CD8+ T cells and tumor cells than is seen for their CXCL13-negative counterparts, which may be significant for understanding immune responses in the tumor microenvironment.

In summary, this study aims to address the limitations of existing techniques by introducing the Proseg cell segmentation algorithm, thereby improving the quality and reliability of single-cell spatial transcriptomics data analysis.
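To make the Cellular Potts idea mentioned above more concrete, the following is a minimal toy sketch, not Proseg's actual algorithm or code. It assumes a small 2D pixel grid, one observed transcript per pixel, two cells with fixed and known expression profiles, and a simple energy made of a boundary penalty plus the negative log-likelihood of each transcript under its assigned cell. All names (`local_energy`, `metropolis_sweep`, `J`, `T`, `profiles`) are hypothetical and chosen for illustration only.

```python
import numpy as np

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def local_energy(labels, genes, profiles, y, x, lab, J=1.0):
    """Energy of pixel (y, x) if it carried cell label `lab` (toy model)."""
    h, w = labels.shape
    # Boundary term: number of 4-neighbors whose label differs from `lab`.
    mismatches = 0
    for dy, dx in NEIGHBORS:
        ny, nx = y + dy, x + dx
        if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] != lab:
            mismatches += 1
    # Data term: negative log-probability of the transcript at this pixel
    # under the expression profile of cell `lab` (gene -1 means no transcript).
    g = genes[y, x]
    nll = -np.log(profiles[lab, g]) if g >= 0 else 0.0
    return J * mismatches + nll

def metropolis_sweep(labels, genes, profiles, rng, J=1.0, T=1.0):
    """One sweep of CPM-style label-copy proposals with Metropolis acceptance."""
    h, w = labels.shape
    for _ in range(h * w):
        y, x = rng.integers(h), rng.integers(w)
        dy, dx = NEIGHBORS[rng.integers(4)]
        ny, nx = y + dy, x + dx
        if not (0 <= ny < h and 0 <= nx < w):
            continue
        old, new = labels[y, x], labels[ny, nx]
        if new == old:
            continue
        # Only terms involving pixel (y, x) change, so a local difference suffices.
        dE = (local_energy(labels, genes, profiles, y, x, new, J)
              - local_energy(labels, genes, profiles, y, x, old, J))
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            labels[y, x] = new
    return labels

# Toy data: the true boundary lies between columns 3 and 4.
truth = np.zeros((8, 8), dtype=int)
truth[:, 4:] = 1
genes = truth.copy()                 # cell 0 emits gene 0, cell 1 emits gene 1
profiles = np.array([[0.95, 0.05],   # P(gene | cell 0)
                     [0.05, 0.95]])  # P(gene | cell 1)

# Deliberately wrong initial segmentation: boundary between columns 1 and 2.
labels = np.zeros((8, 8), dtype=int)
labels[:, 2:] = 1

rng = np.random.default_rng(0)
for _ in range(200):
    metropolis_sweep(labels, genes, profiles, rng, J=1.0, T=0.1)

print((labels == truth).mean())  # fraction of pixels matching the true segmentation
```

In this sketch the transcripts pull the misplaced boundary toward the position that best explains them, while the boundary penalty keeps cell shapes compact. A real method would also need to estimate the expression profiles, handle 3D coordinates, and model many cells and background, none of which is attempted here.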