Abstract: Stochastic individual-based mathematical models are attractive for modelling biological phenomena because they naturally capture the stochasticity and variability that are often evident in biological data. Such models also allow us to track the motion of individuals within the population of interest. Unfortunately, capturing this microscopic detail means that simulation and parameter inference can become computationally expensive. One approach for overcoming this computational limitation is to coarse-grain the stochastic model to provide an approximate continuum model that can be solved using far less computational effort. However, coarse-grained continuum models can be biased or inaccurate, particularly for certain parameter regimes. In this work, we combine stochastic and continuum mathematical models in the context of lattice-based models of two-dimensional cell biology experiments by demonstrating how to simulate two commonly used experiments: cell proliferation assays and barrier assays. Our approach involves building a simple statistical model of the discrepancy between the expensive stochastic model and the associated computationally inexpensive coarse-grained continuum model. We form this statistical model based on a limited number of expensive stochastic model evaluations at design points sampled from a user-chosen distribution, corresponding to a computer experiment design problem. With straightforward design point selection schemes, we show that using the statistical model of the discrepancy in tandem with the computationally inexpensive continuum model allows us to carry out prediction and inference while correcting for biases and inaccuracies due to the continuum approximation.
We demonstrate this approach by simulating a proliferation assay, where the continuum limit model is the well-known logistic ordinary differential equation, as well as a barrier assay, where the continuum limit model is closely related to the well-known Fisher-KPP partial differential equation. We construct an approximate likelihood function for parameter inference, both with and without discrepancy correction terms. Using maximum likelihood estimation, we provide point estimates of the unknown parameters, and use the profile likelihood to characterise the uncertainty in these estimates and form approximate confidence intervals. For the range of inference problems considered, working with the continuum limit model alone leads to biased parameter estimates and confidence intervals with poor coverage. In contrast, incorporating correction terms arising from the statistical model of the discrepancy allows us to recover the parameters accurately with minimal computational overhead. The main tradeoff is that the associated confidence intervals are typically broader, reflecting the additional uncertainty introduced by the approximation process. All algorithms required to replicate the results in this work are written in the open source Julia language and are available on GitHub.
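
The discrepancy-correction idea for the proliferation assay can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the authors' Julia implementation. Every name in it (`logistic`, `simulate_density`, `fit_discrepancy`, `log_likelihood`) is hypothetical, and it rests on several simplifying assumptions: the lattice model has proliferation only (no motility), design points are drawn from a uniform distribution, the discrepancy is summarised by a single mean and variance rather than a richer statistical model, and the observation noise is Gaussian with an assumed variance `sigma2`.

```python
import math
import random
import statistics


def logistic(t, lam, c0):
    """Closed-form solution of dC/dt = lam * C * (1 - C) with C(0) = c0."""
    e = math.exp(lam * t)
    return c0 * e / (1.0 + c0 * (e - 1.0))


def simulate_density(p_prolif, c0, steps, size, rng):
    """One realisation of a lattice proliferation assay (assumed rules).

    Each site starts occupied with probability c0. At every step each agent
    attempts to divide with probability p_prolif, placing a daughter on a
    randomly chosen nearest-neighbour site if that site is empty.
    Returns the final occupied fraction of the lattice.
    """
    occupied = [[rng.random() < c0 for _ in range(size)] for _ in range(size)]
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(steps):
        agents = [(i, j) for i in range(size) for j in range(size) if occupied[i][j]]
        rng.shuffle(agents)
        for i, j in agents:
            if rng.random() < p_prolif:
                di, dj = rng.choice(moves)
                ni, nj = (i + di) % size, (j + dj) % size  # periodic boundaries
                if not occupied[ni][nj]:
                    occupied[ni][nj] = True
    return sum(map(sum, occupied)) / (size * size)


def fit_discrepancy(design_points, c0, steps, size, rng):
    """Mean and sample variance of (stochastic - continuum) over the design points."""
    d = [simulate_density(p, c0, steps, size, rng) - logistic(steps, p, c0)
         for p in design_points]
    return statistics.mean(d), statistics.variance(d)


def log_likelihood(data, p, c0, steps, mean_d=0.0, var_d=0.0, sigma2=1e-3):
    """Gaussian log-likelihood of final densities; discrepancy terms are optional."""
    mu = logistic(steps, p, c0) + mean_d   # bias-corrected continuum prediction
    var = sigma2 + var_d                   # variance inflated by discrepancy uncertainty
    return sum(-0.5 * math.log(2.0 * math.pi * var) - 0.5 * (y - mu) ** 2 / var
               for y in data)


rng = random.Random(1)
c0, steps, size = 0.05, 40, 20

# A small number of expensive stochastic runs at sampled design points.
design_points = [rng.uniform(0.05, 0.15) for _ in range(10)]
mean_d, var_d = fit_discrepancy(design_points, c0, steps, size, rng)

p_new = 0.1
continuum = logistic(steps, p_new, c0)
corrected = continuum + mean_d
print(f"continuum: {continuum:.3f}, corrected: {corrected:.3f} "
      f"(discrepancy sd {math.sqrt(var_d):.3f})")
```

Calling `log_likelihood` with the default `mean_d=0.0, var_d=0.0` gives the uncorrected continuum likelihood, while passing the fitted values gives the corrected one. Adding `var_d` to the variance is one simple way to reproduce the tradeoff described above: the correction removes bias, but the resulting likelihood is flatter, so intervals derived from it tend to be wider.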

Sequential design of single-cell experiments to identify discrete stochastic models for gene expression

Parameterizing Cell-to-cell Regulatory Heterogeneities Via Stochastic Transcriptional Profiles

Multi-Experiment Nonlinear Mixed Effect Modeling of Single-Cell Translation Kinetics after Transfection

Designing Experiments to Understand the Variability in Biochemical Reaction Networks

Interpretable and tractable models of transcriptional noise for the rational design of single-molecule quantification experiments

Inferring gene regulatory networks from single-cell data: a mechanistic approach

A sequential Monte Carlo approach to gene expression deconvolution

Inferring Population Dynamics from Single-Cell RNA-sequencing Time Series Data

A deep generative model for single-cell RNA sequencing with application to detecting differentially expressed genes

The finite state projection based Fisher information matrix approach to estimate information and optimize single-cell experiments

Data Exploration, Quality Control and Testing in Single-Cell qPCR-Based Gene Expression Experiments

Parameter inference for stochastic biochemical models from perturbation experiments parallelised at the single cell level

scDesign2: a transparent simulator that generates high-fidelity single-cell gene expression count data with gene correlations captured

Deciphering regulatory architectures from synthetic single-cell expression patterns

Design and power analysis for multi-sample single cell genomics experiments

Synthesising Executable Gene Regulatory Networks from Single-cell Gene Expression Data

Stochasticity or Noise in Biochemical Reactions

scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics

Integrated Pipelines for Inferring Gene Regulatory Networks from Single-Cell Data

Reliable and efficient parameter estimation using approximate continuum limit descriptions of stochastic models

Advanced methods for gene network identification and noise decomposition from single-cell data