Operate a Cell-Free Biofoundry using Large Language Models

Joan Herisson, Ngoc An Hoang, Aisha El Sawah, Mostafa Mahdy Khalil, Jean-Loup Faulon
DOI: https://doi.org/10.1101/2024.10.28.619828
2024-10-28
Abstract: In this paper, we present a novel approach to optimizing cell-free protein synthesis (CFPS) systems using artificial intelligence (AI), specifically leveraging ChatGPT-4 for code generation and active learning (AL). This study aims to automate and enhance the process of producing antimicrobial proteins, namely colicin M and colicin E1, in CFPS systems. We developed an automated workflow that employs an iterative Design-Build-Test-Learn (DBTL) cycle, integrating a newly implemented AL method with cluster margin (CM) selection to efficiently explore experimental conditions. The workflow components, including modules for sampling, plate design, instruction generation, and data analysis, were coded using ChatGPT-4 without further human modification. By employing this automated approach, significant improvements in protein yields were achieved, with a 9-fold increase for colicin M and a 3-fold increase for colicin E1 compared to standard buffer compositions. The use of LLMs in conjunction with AL demonstrated the potential of AI-driven methodologies to accelerate the optimization of complex biological processes and reduce manual intervention. The study also discusses limitations such as variability in CFPS and suggests future improvements in automation, reproducibility, and integration of diverse liquid handling systems to further enhance the scalability and efficiency of cell-free biofoundries.
Synthetic Biology
What problem does this paper attempt to address?
The paper aims to optimize cell-free protein synthesis (CFPS) systems in order to increase the yield of antimicrobial proteins, namely colicin M and colicin E1. To do so, the authors use artificial intelligence (AI), specifically ChatGPT-4 for code generation combined with active learning (AL), to automate and enhance the optimization process.

### Main Problems and Solutions

1. **Increasing Protein Yield**
   - **Problem**: Conventional CFPS systems can give low yields when producing specific proteins.
   - **Solution**: The authors developed an automated workflow based on an iterative Design-Build-Test-Learn (DBTL) cycle, combined with a newly implemented active learning method using Cluster Margin (CM) selection, to explore experimental conditions efficiently. Compared to standard buffer compositions, the yield increased 9-fold for colicin M and 3-fold for colicin E1.
2. **Automation and Reducing Manual Intervention**
   - **Problem**: Traditional methods require a large amount of manual operation and adjustment, which is inefficient and error-prone.
   - **Solution**: All workflow code (sampling, plate design, instruction generation, and data analysis) was generated with ChatGPT-4 and run without further human modification, yielding a highly automated CFPS optimization process. This reduces manual intervention and is intended to improve the repeatability and efficiency of the experiments.
3. **Optimization of Complex Biological Processes**
   - **Problem**: Optimizing a CFPS system involves many interacting variables, which makes a global optimum hard to find with traditional methods.
   - **Solution**: AI-driven methods, especially active learning, can find a component combination close to the optimum within a limited number of experiments, thus accelerating the optimization of complex biological processes.
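
The Cluster Margin selection step mentioned above can be illustrated with a minimal sketch. This is not the authors' ChatGPT-4-generated code. It assumes the commonly described form of the method: keep the most uncertain candidates, cluster them, then pick across clusters in round-robin order starting with the smallest clusters. The function names, the `pool_size` parameter, the generic per-candidate uncertainty score (for example, the spread of an ensemble's yield predictions), the deterministic choice within each cluster, and the toy data are all assumptions made for illustration.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two concentration vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative_clusters(points, n_clusters):
    """Average-linkage agglomerative clustering.

    Returns a list of clusters, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average pairwise distance between the two clusters.
            total = sum(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
            dist = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or dist < best[0]:
                best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


def cluster_margin_select(candidates, uncertainty, pool_size, n_clusters, batch_size):
    """Pick a diverse batch of uncertain candidates (hypothetical helper).

    candidates  : list of tuples, one buffer composition per entry
    uncertainty : list of floats, higher means the model is less sure
    """
    # 1. Keep only the `pool_size` most uncertain candidates.
    ranked = sorted(range(len(candidates)), key=lambda i: -uncertainty[i])
    pool = ranked[:pool_size]

    # 2. Cluster the pool in composition space.
    clusters = agglomerative_clusters([candidates[i] for i in pool], n_clusters)
    # Map back to candidate indices; most uncertain first within each cluster
    # (a simplification chosen here to keep the example deterministic).
    clusters = [sorted((pool[j] for j in c), key=lambda i: -uncertainty[i])
                for c in clusters]

    # 3. Round-robin over clusters, smallest cluster first.
    clusters.sort(key=len)
    selected = []
    while len(selected) < batch_size and any(clusters):
        for c in clusters:
            if c and len(selected) < batch_size:
                selected.append(c.pop(0))
    return selected


# Toy data: two-component buffer compositions forming three visible groups.
candidates = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (5.0, 5.0), (5.1, 5.0), (9.0, 1.0)]
uncertainty = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
selected = cluster_margin_select(candidates, uncertainty,
                                 pool_size=6, n_clusters=3, batch_size=3)
print(selected)
```

With this toy input, each of the three groups contributes one composition to the batch, instead of all three picks coming from the most uncertain group. Spreading picks across clusters is what lets a small plate cover distinct regions of the composition space in each DBTL round.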
### Conclusion

By combining AI and active learning techniques, this study shows how to effectively improve the performance of CFPS systems, providing new ideas and technical means for the future development of biomanufacturing and synthetic biology. The study also discusses limitations of the current method, such as the variability of CFPS systems, and proposes directions for future improvement, including increasing the level of automation, enhancing reproducibility, and integrating more diverse liquid handling systems.