Workflows for Artificial Intelligence

Akhil S. Nair, Jörg Behler, Gábor Csányi, Lucas Foppa, Kisung Kang, Marcel F. Langer, Johannes T. Margraf, Thomas A. R. Purcell, Patrick Rinke, Matthias Scheffler, Alexandre Tkatchenko, Milica Todorović, Oliver T. Unke, Yi Yao
DOI: https://doi.org/10.26434/chemrxiv-2024-vw06p
2024-11-13
Abstract: The efficiency and reliability of artificial-intelligence (AI)-driven physics, chemistry, biophysics, materials science and engineering depend on the acquisition of sufficient, high-quality data. Due to its all-electron, full-potential treatment and its scalability to larger systems without precision limitations, FHI-aims provides accurate ab initio data from a wide range of computer simulations, such as electronic structure calculations and molecular dynamics. To leverage the capabilities of AI models, workflows that seamlessly integrate AI tools with FHI-aims are essential. These workflows automate the acquisition of data and their use by AI. Thus, they facilitate the iterative data exchange between AI models and simulations, allowing FHI-aims to be used as a powerful AI-guided calculation engine. In addition, interpretable AI models aid in analyzing the generated data. Furthermore, AI complements ab initio studies, as it enables simulations at larger time and length scales. In turn, the AI models must also incorporate the physics required for an accurate representation of the ab initio data. This contribution highlights workflows developed to integrate FHI-aims with AI, as well as future challenges.
Chemistry
What problem does this paper attempt to address?
The paper addresses how to obtain sufficient, high-quality data efficiently and reliably for AI-driven research in physics, chemistry, biophysics, materials science and engineering. Specifically, it focuses on workflows that seamlessly integrate AI tools with FHI-aims (a software package that provides accurate ab initio data) to automate data acquisition and its use by AI. These workflows are designed to support iterative data exchange between AI models and simulations, enabling FHI-aims to be used as a powerful AI-guided calculation engine. The paper emphasizes several key points:

1. **Data Quality and Quantity**: The efficiency and reliability of AI models depend on sufficient, high-quality data. FHI-aims can provide accurate ab initio data because of its all-electron, full-potential treatment and its scalability to larger systems.
2. **Complementarity of AI and Traditional Methods**: AI complements ab initio studies by enabling simulations at larger time and length scales. In turn, AI models must incorporate the physics required to represent the ab initio data accurately.
3. **Workflow Development**: The paper introduces several workflows developed to integrate FHI-aims with AI techniques, such as the generation of training data for machine-learning interatomic potentials (MLIPs) and new data-acquisition strategies guided by uncertainty estimation.
4. **Challenges and Future Directions**: Although MLIPs are increasingly widely used, their reliability in predicting the properties of configurations or chemical species that differ significantly from the training set remains a problem. In addition, for high-dimensional materials problems with expensive objective-function evaluations, the sequential active learning (SAL) workflow can significantly reduce the sampling requirements, as demonstrated with a Bayesian optimization algorithm.

In conclusion, the paper focuses on improving the efficiency and quality of data acquisition in scientific research by integrating advanced AI techniques with FHI-aims, thereby promoting progress in physics, chemistry, biophysics, materials science and engineering.
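
The uncertainty-guided data acquisition mentioned in the summary above can be illustrated with a minimal sketch. This is not the paper's workflow and it does not call FHI-aims: `reference_energy` is a hypothetical stand-in for an expensive ab initio calculation, the surrogate is a simple bootstrap committee of kernel regressors, and the acquisition rule is committee disagreement rather than the Bayesian optimization used in the paper. All function names and parameters are assumptions made for illustration.

```python
import math
import random
import statistics


def reference_energy(x):
    """Hypothetical stand-in for one expensive ab initio calculation."""
    return math.sin(3.0 * x) + 0.5 * x * x


def kernel_predict(train, x, bandwidth=0.3):
    """Nadaraya-Watson regression: Gaussian-weighted mean of the training labels."""
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi, _ in train]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed; fall back to the plain mean of the labels.
        return statistics.fmean([yi for _, yi in train])
    return sum(w * yi for w, (_, yi) in zip(weights, train)) / total


def active_learning(candidates, n_initial=3, n_queries=10, n_models=8, seed=0):
    """Label the candidates on which a bootstrap committee disagrees most."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    # Seed the training set with a few randomly chosen labelled points.
    train = [(x, reference_energy(x)) for x in pool[:n_initial]]
    pool = pool[n_initial:]
    for _ in range(n_queries):
        # Each committee member sees a bootstrap resample of the training set.
        committee = [[rng.choice(train) for _ in train] for _ in range(n_models)]

        def disagreement(x):
            # Spread of committee predictions serves as the uncertainty estimate.
            return statistics.pstdev([kernel_predict(m, x) for m in committee])

        x_new = max(pool, key=disagreement)
        pool.remove(x_new)
        # Only the selected point triggers an "expensive" reference calculation.
        train.append((x_new, reference_energy(x_new)))
    return train, pool


if __name__ == "__main__":
    grid = [-2.0 + 0.1 * i for i in range(41)]
    train, pool = active_learning(grid)
    print(f"labelled {len(train)} of {len(grid)} candidates")
```

The point of the loop is that the reference calculation runs only for candidates the current model is least sure about, which is how such workflows reduce the number of expensive evaluations. In a real setting the stand-in function would be replaced by a call to the simulation code and the committee by the MLIP or surrogate model of choice.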