Multi-ContrastiveVAE disentangles perturbation effects in single cell images from optical pooled screens

Zitong Jerry Wang, Romain Lopez, Jan-Christian Hütter, Takamasa Kudo, Heming Yao, Philipp Hanslovsky, Burkhard Höckendorf, Rahul Moran, David Richmond, Aviv Regev
DOI: https://doi.org/10.1101/2023.11.28.569094
2024-03-19
Abstract: Optical pooled screens (OPS) enable comprehensive and cost-effective interrogation of gene function by measuring microscopy images of millions of cells across thousands of perturbations. However, the analysis of OPS data still mainly relies on hand-crafted features, even though these are difficult to deploy across complex data sets. This is because most unsupervised feature extraction methods based on neural networks (such as auto-encoders) have difficulty isolating the effect of perturbations from the natural variations across cells and experimental batches. Here, we propose a contrastive analysis framework that can more effectively disentangle the phenotypes caused by perturbation from natural cell-cell heterogeneity present in an unperturbed cell population. We demonstrate this approach by analyzing a large data set of over 30 million cells imaged across more than 5,000 genetic perturbations, showing that our method significantly outperforms traditional approaches in generating biologically-informative embeddings and mitigating technical artifacts. Furthermore, the interpretable part of our model distinguishes perturbations that generate novel phenotypes from the ones that only shift the distribution of existing phenotypes. Our approach can be readily applied to other small-molecule and genetic perturbation data sets with highly multiplexed images, enhancing the efficiency and precision in identifying and interpreting perturbation-specific phenotypic patterns, paving the way for deeper insights and discoveries in OPS analysis.
Bioinformatics