Optimisation of surfactin yield in Bacillus using data-efficient active learning and high-throughput mass spectrometry

Ricardo Valencia Albornoz,Diego Oyarzún,Karl Burgess
DOI: https://doi.org/10.1016/j.csbj.2024.02.012
IF: 6.155
2024-02-17
Computational and Structural Biotechnology Journal
Abstract:Integration of machine learning and high throughput measurements are essential to drive the next generation of the design-build-test-learn (DBTL) cycle in synthetic biology. Here, we report the use of active learning in combination with metabolomics for optimising production of surfactin, a complex lipopeptide resulting from a non-ribosomal assembly pathway. We designed a media optimisation algorithm that iteratively learns the yield landscape and steers the media composition toward maximal production. The algorithm led to a 160% yield increase after three DBTL runs as compared to an M9 baseline. Metabolomics data helped to elucidate the underpinning biochemistry for yield improvement and revealed Pareto-like trade-offs in production of other lipopeptides from related pathways. We found positive associations between organic acids and surfactin, suggesting a key role of central carbon metabolism, as well as system-wide anisotropies in how metabolism reacts to shifts in carbon and nitrogen levels. Our framework offers a novel data-driven approach to improve yield of biological products with complex synthesis pathways that are not amenable to traditional yield optimisation strategies. Graphical abstract Download : Download high-res image (234KB) Download : Download full-size image
biochemistry & molecular biology
What problem does this paper attempt to address?