Active delta-learning for fast construction of interatomic potentials and stable molecular dynamics simulations

Pavlo O. Dral, Yaohuang Huang, Yi-Fan Hou
DOI: https://doi.org/10.26434/chemrxiv-2024-fb02r
2024-11-19
Abstract: Active learning requires massive time for comprehensive sampling of complex potential energy surfaces (PESs) to achieve desirable accuracy and stability of machine learning (ML) potentials. Here we develop an active delta-learning (ADL) protocol for speeding up active learning and building delta-learning models stable in simulations. Our results show that ADL needs ca. ten times fewer sampled points and iterations to obtain models of the same accuracy as without delta-learning. The crucial advantage of the models built with the delta-learning protocol is their remarkable simulation stability: even models from the initial active learning iterations yield reasonable results. In contrast, the pure ML potentials built without delta-learning often lead to the collapse in simulations, i.e., to unphysical structures.
Chemistry
What problem does this paper attempt to address?
The paper addresses one main question: **how to speed up the construction of machine learning (ML) potentials while keeping them stable in molecular dynamics simulations**. To this end, the authors develop a protocol called **active delta-learning (ADL)**, which targets the bottlenecks of conventional active learning (AL) when building accurate and stable ML potentials.

### Main problems:

1. **Low data generation efficiency**: Conventional active learning needs a large amount of time and computing resources to sample a complex potential energy surface (PES) comprehensively before the potential reaches the required accuracy and stability.
2. **Simulation instability**: Even an ML potential that looks accurate during training can still collapse in actual molecular dynamics simulations, producing unphysical structures.

### Solutions:

To overcome these problems, the authors combine active learning with **delta-learning**. Instead of learning the target quantum chemistry (QC) method from scratch, a delta-learning model learns the difference between a fast, approximate QC baseline method and the target QC method; the baseline plus the learned correction then approximates the target.

#### Specific improvements:

- **Fewer sampled points and iterations**: The ADL protocol needs only about one-tenth of the sampled points and iterations to reach the same accuracy as active learning without delta-learning.
- **Better simulation stability**: Delta-learning models give reasonable simulation results even in the early iterations, avoiding the collapse that pure ML potentials often show in simulations.
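The delta-learning idea can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional Morse curves, their parameters, and the piecewise-linear interpolator standing in for a trained ML model are all assumptions chosen only to keep the example self-contained. The sketch trains one model directly on the "target" energies and another on the difference between target and baseline, then compares their errors on a dense test grid.

```python
import math
from bisect import bisect_right


def morse(r, depth, width, r_eq):
    """Morse potential, used here as a toy one-dimensional PES."""
    return depth * (1.0 - math.exp(-width * (r - r_eq))) ** 2


def target_energy(r):
    # Stand-in for the expensive target QC method (hypothetical parameters).
    return morse(r, depth=1.0, width=1.0, r_eq=1.0)


def baseline_energy(r):
    # Stand-in for the fast, approximate baseline QC method (hypothetical).
    return morse(r, depth=0.9, width=1.0, r_eq=1.05)


def fit_interpolator(xs, ys):
    """Toy 'ML model': piecewise-linear interpolation through the training data."""
    def predict(x):
        # Locate the enclosing training interval, clamped to the first/last one.
        i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])
    return predict


def rmse(pred, ref):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(pred, ref)) / len(ref))


# Six training geometries and a dense test grid on the same interval.
train_r = [0.5 + 0.5 * i for i in range(6)]          # 0.5, 1.0, ..., 3.0
test_r = [0.5 + 2.5 * i / 200 for i in range(201)]

# Direct learning: the model is trained on the target energies themselves.
direct_model = fit_interpolator(train_r, [target_energy(r) for r in train_r])

# Delta-learning: the model is trained on (target - baseline) only.
delta_model = fit_interpolator(
    train_r, [target_energy(r) - baseline_energy(r) for r in train_r]
)

reference = [target_energy(r) for r in test_r]
direct_pred = [direct_model(r) for r in test_r]
# Final delta-learning prediction = baseline + learned correction.
delta_pred = [baseline_energy(r) + delta_model(r) for r in test_r]

rmse_direct = rmse(direct_pred, reference)
rmse_delta = rmse(delta_pred, reference)
print(f"direct RMSE: {rmse_direct:.5f}")
print(f"delta  RMSE: {rmse_delta:.5f}")
```

In this toy setting the correction is much smaller and smoother than the full energy curve, so the same six training points describe it far better. Because the baseline term is always evaluated explicitly, the prediction also keeps the baseline's qualitative shape where training data are sparse, which is one plausible reading of the stability advantage reported for delta-learning models.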
### Experimental verification:

The authors use Diels-Alder reactions as test cases and demonstrate the advantages of the ADL protocol on two reaction systems:

- **Acetylene + 1,3-butadiene**
- **C60 + 2,3-dimethyl-1,3-butadiene**

The results show that the ADL protocol significantly reduces the required number of sampled points and iterations, and that the models from all iterations produce stable and reasonable simulation results.

### Conclusion:

By combining a fast, approximate baseline method with the target QC method, the ADL protocol addresses both the data generation efficiency and the simulation stability problems of conventional active learning for ML potentials. It greatly shortens the time needed for the model to converge and improves the reliability of the model in practical applications.