Machine Learning-Based Recommendation of Optimal Crystallization Conditions for Organic Small Molecules

Zi Li, Wenbo Fu, Bochen Li, Jia Yao, Jiuchuang Yuan, Michael Bellucci, Guangxu Sun, Zhengtian Song, Shi Liu, Zhu Lang, Jian Ma, Shuhao Wen, Qun Zeng
DOI: https://doi.org/10.26434/chemrxiv-2024-5w5rp
2024-07-16
Abstract: Crystallization is an important process in a broad range of industries, though studies on this topic remain complicated. Recently, machine learning has been applied to resolve complex issues in chemistry and material science. Here we present a machine learning model to propose crystallization experiments for organic small molecules. This model has been integrated into a robotic platform that performs experiments automatically. To improve applicability and accuracy, the model was trained on both simulated and experimental data. In comparative case studies, polymorph screening experiments by the platform yielded a high rate of solid products, and the number of forms obtained by the platform equaled those obtained by human researchers. The model provides a data-based perspective of the promoting and inhibiting influences to crystallization from molecular and interaction features. This work demonstrates the feasibility of applying machine learning techniques to solid-state studies to boost efficiency and deepen understanding.
Chemistry
What problem does this paper attempt to address?
The paper addresses the complexity and inefficiency of optimizing crystallization conditions and screening polymorphs for small organic molecules. Crystallization plays a crucial role in material discovery and development. In the pharmaceutical industry in particular, polymorphism of active pharmaceutical ingredients (APIs) is very common, and the choice of solid form has a significant impact on drug performance and manufacturability. Crystallization from solution is affected by multiple factors, however, and the individual contribution of each variable is difficult to quantify, so traditional polymorph screening is both time-consuming and dependent on experience.

To address these issues, the authors developed a machine-learning (ML) model that recommends optimal crystallization conditions for small organic molecules and integrated it into an automated experimental platform. The work addresses the limitations of existing methods in the following ways:

1. **Data-driven crystallization condition prediction**: The model is trained on a combination of simulated and experimental data to improve its accuracy and applicability.
2. **Automated experimental platform**: The model is integrated with a custom-made workstation that runs crystallization experiments automatically, accelerating the experimental process and reducing human error.
3. **Efficient polymorph screening**: The model predicts crystallization conditions, reducing the number of unnecessary experiments and increasing the probability of discovering new polymorphs.

### Main contributions

- **Improve experimental efficiency**: Compared with traditional manual experiments, the platform can complete more experiments in a shorter time and has a higher experimental success rate.
- **Reveal the influence of molecular features on crystallization**: Contribution analysis of the model identified molecular and interaction features that promote or inhibit crystallization, providing a new perspective on the crystallization mechanism.
- **Expand the application range**: The intelligent automated platform is applicable to evaporation and anti-solvent addition crystallization and can also be extended to other crystallization methods, giving it broad application prospects.

Through these improvements, the study demonstrates the effectiveness of applying machine learning and automation to crystallization research, and offers new ideas for building intelligent automated systems in materials chemistry.