Abstract: The continued scaling of genetic perturbation technologies combined with high-dimensional assays such as cellular microscopy and RNA-sequencing has enabled genome-scale reverse-genetics experiments that go beyond single-endpoint measurements of growth or lethality. Datasets emerging from these experiments can be combined to construct perturbative "maps of biology", in which readouts from various manipulations (e.g., CRISPR-Cas9 knockout, CRISPRi knockdown, compound treatment) are placed in unified, relatable embedding spaces allowing for the generation of genome-scale sets of pairwise comparisons. These maps of biology capture known biological relationships and uncover new associations which can be used for downstream discovery tasks. Construction of these maps involves many technical choices in both experimental and computational protocols, motivating the design of benchmark procedures to evaluate map quality in a systematic, unbiased manner. Here, we (1) establish a standardized terminology for the steps involved in perturbative map building, (2) introduce key classes of benchmarks to assess the quality of such maps, (3) construct 18 maps from four genome-scale datasets employing different cell types, perturbation technologies, and data readout modalities, (4) generate benchmark metrics for the constructed maps and investigate the reasons for performance variations, and (5) demonstrate utility of these maps to discover new biology by suggesting roles for two largely uncharacterized genes.

Due to the rapid advancements in genetic perturbation, laboratory robotics, sequencing, and computer vision, more researchers are now generating datasets that capture cellular responses to genetic perturbations. These datasets can be powerful discovery tools for examining known biological relationships and revealing new associations in an unbiased manner when paired with a computational pipeline that can assemble the data into a digestible format. However, the challenge arises from the variety of cellular models, assay designs, terminologies, codebases, and analysis methods involved. In this work we define a unified framework for building and benchmarking perturbative maps, benchmark four different datasets assembled into 18 different maps, explore the impact of different design decisions, and demonstrate how these maps can be used to elucidate gene functions. Our goal is to facilitate comparisons across various technologies and methods by introducing a shared language for the field. The open-source codebase, capable of incorporating new methods, aims to be a resource for researchers developing laboratory or computational methodology. While we caution against definitive recommendations due to numerous variables at play, we hope to stimulate studies directly comparing methods under controlled conditions. Our framework can also help evaluate combining maps across modalities as the field progresses.
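To make the idea of "pairwise comparisons in a shared embedding space" concrete, the following is a minimal sketch and not the authors' pipeline. It assumes each perturbation has already been reduced to one aggregated embedding vector, uses cosine similarity as the comparison (an assumption), and scores a map by the fraction of known related pairs that fall in the extreme tails of the similarity distribution. The function names, the 5% tail threshold, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between the rows of an
    (n_perturbations, n_features) array."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def known_pair_recall(similarity, known_pairs, tail_fraction=0.05):
    """Fraction of known pairs whose similarity lies in either extreme tail
    of the distribution of all off-diagonal pairwise similarities."""
    n = similarity.shape[0]
    all_pairs = similarity[np.triu_indices(n, k=1)]
    low, high = np.quantile(all_pairs, [tail_fraction, 1.0 - tail_fraction])
    hits = sum(
        1 for i, j in known_pairs
        if similarity[i, j] <= low or similarity[i, j] >= high
    )
    return hits / len(known_pairs)


# Toy data (hypothetical): 50 perturbations with 16-dimensional embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(50, 16))
# Make perturbation 1 a near-copy of perturbation 0 to mimic a known relationship.
embeddings[1] = embeddings[0] + 0.01 * rng.normal(size=16)

sim = cosine_similarity_matrix(embeddings)
recall = known_pair_recall(sim, known_pairs=[(0, 1)])
print(f"recall of known pairs in the 5% tails: {recall:.2f}")
```

In practice the known pairs would come from an annotation source chosen by the analyst. Both tails are counted here on the assumption that strongly opposed responses can be as informative as strongly similar ones.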
