A framework for fully autonomous design of materials via multiobjective optimization and active learning: challenges and next steps

Tyler H. Chang, Jakob R. Elias, Stefan M. Wild, Santanu Chaudhuri, Joseph A. Libera
2023-04-15
Abstract: In order to deploy machine learning in a real-world self-driving laboratory where data acquisition is costly and there are multiple competing design criteria, systems need to be able to intelligently sample while balancing performance trade-offs and constraints. For these reasons, we present an active learning process based on multiobjective black-box optimization with continuously updated machine learning models. This workflow is built on open-source technologies for real-time data streaming and modular multiobjective optimization software development. We demonstrate a proof of concept for this workflow through the autonomous operation of a continuous-flow chemistry laboratory, which identifies ideal manufacturing conditions for the electrolyte 2,2,2-trifluoroethyl methyl carbonate.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the automated design and operation of laboratories in materials science. Its central question is how a real self-driving laboratory can sample intelligently, using multiobjective optimization and active learning, so that it balances performance trade-offs when data acquisition is limited and expensive. The researchers propose an active learning process based on multiobjective black-box optimization, combined with real-time data streaming and modular multiobjective optimization software, to handle data collection and use in complex real-world chemical processes. The method was demonstrated in a proof-of-concept experiment in a continuous-flow chemistry laboratory, which identified ideal manufacturing conditions for the electrolyte 2,2,2-trifluoroethyl methyl carbonate (TFMC).

This work primarily addresses the following issues:

1. **Effective Data Collection**: How to collect and use data effectively in complex experimental processes.
2. **Multi-Objective Balance**: How to balance objectives when there are multiple competing design criteria.
3. **Autonomous Experiment Design**: How to automate the design of laboratory experiments by integrating machine learning models with optimization algorithms.
4. **Efficient Resource Utilization**: How to maximize experimental efficiency under limited resources.

By integrating open-source technologies (such as the ParMOO and MDML platforms), the researchers achieved a closed loop from data collection to experiment design, which accelerates molecular synthesis and materials discovery. The framework also shows potential for extension to a broader range of materials discovery problems.
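The closed loop described above (run an experiment, update what is known about the trade-offs, propose the next experiment) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `run_experiment` is a hypothetical stand-in for a costly lab measurement with two made-up competing objectives, and the proposal step is a simple random perturbation of a nondominated design rather than the surrogate-model-based selection the paper describes. It does not use the ParMOO or MDML APIs.

```python
import random


def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and
    strictly better in at least one objective (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(records):
    """Keep the records whose objective vectors no other record dominates."""
    return [r for r in records
            if not any(dominates(o["f"], r["f"]) for o in records)]


def run_experiment(x):
    """Hypothetical costly measurement: two competing objectives to minimize."""
    return (x ** 2, (x - 2.0) ** 2)


def closed_loop(budget=20, n_init=5, lo=-1.0, hi=3.0, seed=0):
    """Toy active-learning loop under a fixed experiment budget."""
    rng = random.Random(seed)
    records = []
    # Initial design: a few random experiments to seed the loop.
    for _ in range(n_init):
        x = rng.uniform(lo, hi)
        records.append({"x": x, "f": run_experiment(x)})
    # Iterate: pick a current trade-off point, perturb it, measure, record.
    for _ in range(budget - n_init):
        parent = rng.choice(pareto_front(records))
        # Assumption: random perturbation replaces a model-based acquisition step.
        x = min(hi, max(lo, parent["x"] + rng.gauss(0.0, 0.3)))
        records.append({"x": x, "f": run_experiment(x)})
    return records, pareto_front(records)


if __name__ == "__main__":
    records, front = closed_loop()
    print(f"{len(records)} experiments, {len(front)} nondominated designs")
```

The result of such a loop is a set of nondominated designs rather than a single optimum, which is what lets a decision-maker choose among trade-offs after the experiments have run.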