Distribution shapes govern the discovery of predictive models for gene regulation

Brian Munsky, Guoliang Li, Zachary R. Fox, Douglas P. Shepherd, Gregor Neuert
DOI: https://doi.org/10.1073/pnas.1804060115
IF: 11.1
2018-06-29
Proceedings of the National Academy of Sciences
Abstract: Significance: Systems biology seeks to combine experiments with computation to predict biological behaviors. However, despite tremendous data and knowledge, biological models make less-accurate predictions compared with other fields. By analyzing single-cell, single-molecule measurements of mRNA during yeast stress response, we explore why and how the shapes of experimental distributions control prediction accuracy. We show how asymmetric data distributions with long tails cause standard modeling approaches to yield excellent fits but make meaningless predictions. We show how these biases arise from the violation of fundamental assumptions in standard modeling approaches. We demonstrate how advanced computational tools solve this dilemma and achieve predictive understanding of spatiotemporal mechanisms of transcription control including RNA polymerase initiation and elongation and mRNA accumulation, transport, and decay.
multidisciplinary sciences