Segmentation aware probabilistic phenotyping of single-cell spatial protein expression data

Yuju Lee, Edward L. Y. Chen, Darren C. H. Chan, Anuroopa Dinesh, Somaieh Afiuni-Zadeh, Conor Klamann, Alina Selega, Miralem Mrkonjic, Hartland W. Jackson, Kieran R. Campbell
DOI: https://doi.org/10.1101/2024.02.29.582827
2024-04-03
Abstract: Spatial protein expression technologies can map cellular content and organization by simultaneously quantifying the expression of >40 proteins at subcellular resolution within intact tissue sections and cell lines. However, the necessary segmentation of images into single cells is challenging and error-prone, easily confounding the interpretation of cellular phenotypes and cell clusters. To address these limitations, we present STARLING, a novel probabilistic machine learning model designed to quantify cell populations from spatial protein expression data while accounting for segmentation errors. To evaluate performance, we developed a comprehensive benchmarking workflow by generating highly multiplexed imaging data of cell line pellet standards with controlled cell content and marker expression, and additionally established a novel score to quantify the biological plausibility of discovered cellular phenotypes on patient-derived tissue sections. Moreover, we generate spatial expression data of the human tonsil – a densely packed tissue prone to segmentation errors – and demonstrate that cellular states captured by STARLING identify known cell types not visible with other methods and enable quantification of intra- and inter-individual heterogeneity. STARLING is available at .
Bioinformatics