A multi-scale expression and regulation knowledge base for Escherichia coli

Cameron R Lamoureux, Katherine T Decker, Anand V Sastry, Kevin Rychel, Ye Gao, John Luke McConn, Daniel C Zielinski, Bernhard O Palsson
DOI: https://doi.org/10.1093/nar/gkad750
IF: 14.9
2023-09-17
Nucleic Acids Research
Abstract: Transcriptomic data is accumulating rapidly; thus, scalable methods for extracting knowledge from this data are critical. Here, we assembled a top-down expression and regulation knowledge base for Escherichia coli. The expression component is a 1035-sample, high-quality RNA-seq compendium consisting of data generated in our lab using a single experimental protocol. The compendium contains diverse growth conditions, including: 9 media; 39 supplements, including antibiotics; 42 heterologous proteins; and 76 gene knockouts. Using this resource, we elucidated global expression patterns. We used machine learning to extract 201 modules that account for 86% of known regulatory interactions, creating the regulatory component. With these modules, we identified two novel regulons and quantified systems-level regulatory responses. We also integrated 1675 curated, publicly available transcriptomes into the resource. We demonstrated workflows for analyzing new data against this knowledge base via deconstruction of regulation during aerobic transition. This resource illuminates the E. coli transcriptome at scale and provides a blueprint for top-down transcriptomic analysis of non-model organisms.
biochemistry & molecular biology