Inference of heterogeneous effects in single-cell genetic perturbation screens

Zichu Fu, Lin Hou
DOI: https://doi.org/10.1101/2024.05.28.596224
2024-06-02
Abstract: Recent single-cell CRISPR screening experiments have combined the advances of genetic editing and single-cell technologies, leading to transcriptome-scale readouts of responses to perturbations at single-cell resolution. An outstanding question is how to efficiently identify heterogeneous causal effects of perturbations using these technologies. Here we present scCAPE, a tool designed to facilitate causal analysis of heterogeneous perturbation effects at the single-cell level. scCAPE disentangles perturbation effects from the inherent cell-state variations and provides nonparametric inferences of perturbation effects at single-cell resolution, permitting a range of downstream tasks including perturbation effect analysis, genetic interaction analysis, perturbation clustering and prioritizing. We benchmarked scCAPE through simulation studies and real datasets to evaluate its performance in characterizing latent confounding factors and accuracy in estimating heterogeneous perturbation effects. The application of scCAPE identified novel heterogeneous genetic interactions among erythroid differentiation drivers. For example, our analysis pinpointed the role of the synergistic interaction between CBL and CNN1 in the S phase.
Bioinformatics
What problem does this paper attempt to address?
This paper addresses how to efficiently identify heterogeneous causal effects in single-cell CRISPR perturbation screens. Specifically, the authors develop a tool that estimates the effects of genetic perturbations at single-cell resolution and separates those effects from inherent cell-state variation, supporting downstream tasks such as genetic interaction analysis, perturbation clustering, and perturbation prioritization.

### Problem Background

Single-cell CRISPR screening combines gene editing with single-cell sequencing, giving researchers transcriptome-scale readouts of perturbation responses in individual cells. Identifying heterogeneous causal effects in these data remains a challenge. Traditional pseudo-bulk methods assume that all cells respond to a perturbation in the same way, which hides the heterogeneous response landscape and limits what can be learned about regulatory mechanisms.

### Research Objectives

To address this challenge, the authors propose scCAPE (Single-Cell Causal Analysis of Perturbation Effects), a framework designed for causal analysis at the single-cell level. Its main objectives are:

1. **Analyze Heterogeneous Perturbation Effects**: Estimate perturbation effects at single-cell resolution through nonparametric inference.
2. **Separate Perturbation Effects from Cell-State Variation**: Isolate the transcriptomic changes caused by a perturbation from those driven by other factors.
3. **Provide Rich Downstream Applications**: Support genetic interaction analysis, perturbation clustering, and perturbation prioritization.

### Solutions

scCAPE achieves these objectives through the following steps:

1. **Autoencoder and Adversarial Training**: An autoencoder trained adversarially decouples perturbation effects from inherent cell-state variation, producing a "ground-state space" that carries no perturbation information but retains each cell's intrinsic state.
2. **Causal Forest Estimation**: Working in the ground-state space, a causal forest recursively splits cells so as to maximize the heterogeneity of perturbation effects, yielding an estimated perturbation effect and its asymptotic variance for each cell.
3. **Statistical Testing and Functional Annotation**: Statistical tests determine whether the estimated perturbation effects are significant, and the effects are functionally annotated through biomarker association and gene-set enrichment analysis.

### Application Examples

The authors benchmark scCAPE on simulated and real datasets, evaluating how well it characterizes latent confounding factors and how accurately it estimates heterogeneous perturbation effects. In datasets of human CD8+ T cells and K562 cells, scCAPE reveals the heterogeneous perturbation landscapes of T-cell regulators and cell-cycle regulators, and it identifies new genetic interactions, such as the synergy between CBL and CNN1 in the S phase.

In conclusion, scCAPE offers a way to analyze the heterogeneous effects of genetic perturbations at the single-cell level, supporting a deeper understanding of gene-regulation mechanisms.
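
To make the contrast between pseudo-bulk and heterogeneity-aware estimation concrete, the following is a minimal toy sketch, not scCAPE's implementation. It simulates one gene in cells that have a known binary state, with a perturbation that acts only in one state, and compares a pooled difference in means against per-state estimates. A stratified difference in means stands in for the causal forest, and a Wald z-test stands in for the statistical testing step. The function names (`simulate_screen`, `diff_in_means`, `wald_p_value`), the simulated effect sizes, and the assumption that the cell state is observed directly are all hypothetical choices made for illustration.

```python
import math
import numpy as np

def simulate_screen(n_cells=4000, effect_in_state_1=2.0, seed=0):
    """Toy data (assumption): one gene, a binary cell state, a binary perturbation."""
    rng = np.random.default_rng(seed)
    state = rng.integers(0, 2, size=n_cells)      # hypothetical ground-state label
    perturbed = rng.integers(0, 2, size=n_cells)  # 1 = perturbed cell, 0 = control
    baseline = 5.0 + 1.5 * state                  # state shifts expression on its own
    effect = effect_in_state_1 * state            # perturbation acts only in state 1
    expr = baseline + effect * perturbed + rng.normal(0.0, 1.0, size=n_cells)
    return state, perturbed, expr

def diff_in_means(expr, perturbed):
    """Effect estimate and its sampling variance from a two-sample difference in means."""
    treated, control = expr[perturbed == 1], expr[perturbed == 0]
    estimate = treated.mean() - control.mean()
    variance = treated.var(ddof=1) / treated.size + control.var(ddof=1) / control.size
    return estimate, variance

def wald_p_value(estimate, variance):
    """Two-sided p-value for H0: effect = 0, using a normal approximation."""
    z = estimate / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2.0))

state, perturbed, expr = simulate_screen()

# Pseudo-bulk style estimate: one number for all cells.
pooled_estimate, _ = diff_in_means(expr, perturbed)

# Heterogeneity-aware estimate: one effect, variance, and p-value per cell state.
per_state = {}
for s in (0, 1):
    mask = state == s
    est, var = diff_in_means(expr[mask], perturbed[mask])
    per_state[s] = (est, var, wald_p_value(est, var))

print(f"pooled effect: {pooled_estimate:.2f}")
for s, (est, var, p) in per_state.items():
    print(f"state {s}: effect={est:.2f}, variance={var:.4f}, p={p:.3g}")
```

In this toy setting the pooled estimate averages a null effect and a strong effect into one intermediate value, while the per-state estimates recover them separately. In real data the cell state is not observed, which is the gap the ground-state space and the causal forest are described as filling.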