De novo exploration and self-guided learning of potential-energy surfaces

Noam Bernstein, Gábor Csányi, Volker L. Deringer
DOI: https://doi.org/10.1038/s41524-019-0236-6
2019-05-25
Abstract: Interatomic potential models based on machine learning (ML) are rapidly developing as tools for materials simulations. However, because of their flexibility, they require large fitting databases that are normally created with substantial manual selection and tuning of reference configurations. Here, we show that ML potentials can be built in a largely automated fashion, exploring and fitting potential-energy surfaces from the beginning (de novo) within one and the same protocol. The key enabling step is the use of a configuration-averaged kernel metric that allows one to select the few most relevant structures at each step. The resulting potentials are accurate and robust for the wide range of configurations that occur during structure searching, despite only requiring a relatively small number of single-point DFT calculations on small unit cells. We apply the method to materials with diverse chemical nature and coordination environments, marking a milestone toward the more routine application of ML potentials in physics, chemistry, and materials science.
Materials Science, Computational Physics
What problem does this paper attempt to address?
This paper addresses how to construct machine-learning (ML) interatomic potentials automatically, exploring and fitting the potential-energy surface (PES) of a material without extensive manual selection and tuning of reference configurations. Specifically, the authors propose a de novo exploration and self-guided learning protocol in which the most relevant structures are selected automatically, without prior knowledge, and the accuracy and robustness of the model improve from one iteration to the next.

#### Main problems:

1. **Reduce manual workload**: In traditional approaches, constructing an ML potential requires extensive manual selection and tuning of reference configurations. This is time-consuming and prone to introducing human bias.
2. **Improve the generalization ability of the model**: Existing ML potentials can usually handle only the specific structure types included in their training set and perform poorly on unseen structures. The proposed method uses automated exploration so that the potential covers a wider range of structures and chemical environments.
3. **Reduce computational cost**: Quantum-mechanical methods such as density functional theory (DFT) are accurate but computationally expensive. An ML potential greatly reduces the cost while retaining accuracy, which accelerates materials simulation and structure searching.

#### Solutions:

- **Automation protocol**: In an iterative loop, a set of structures is generated randomly, and a selection algorithm based on a distance metric (such as the CUR algorithm) picks the most representative structures for DFT calculations.
- **Configuration-averaged kernel metric**: A configuration-averaged SOAP (Smooth Overlap of Atomic Positions) descriptor quantifies the similarity between different structures, so that the most relevant structures can be selected at each step.
- **Active learning**: During the iterations, the reference database is continuously expanded and the ML potential is refitted, so that the model improves gradually and eventually covers a wide range of structure types.

This method offers a new route to applying ML potentials in materials science, physics, and chemistry, and is particularly relevant to structure searching and the discovery of new materials.
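The selection step described under "Solutions" can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the per-atom descriptor vectors are random stand-ins rather than real SOAP vectors, the kernel is a normalized dot product raised to a power `zeta` (a hypothetical parameter name and value), and the selection is a simplified, deterministic leverage-score variant of CUR. The function names `config_average`, `kernel_matrix`, and `select_by_leverage` are illustrative only.

```python
import numpy as np


def config_average(atom_descriptors):
    """Average per-atom descriptor vectors over one structure, then normalize."""
    mean = np.asarray(atom_descriptors, dtype=float).mean(axis=0)
    return mean / np.linalg.norm(mean)


def kernel_matrix(structures, zeta=4):
    """Similarity between structures from their configuration-averaged descriptors.

    Assumption: dot-product kernel raised to the power zeta.
    """
    X = np.stack([config_average(s) for s in structures])
    return (X @ X.T) ** zeta


def select_by_leverage(K, n_select, rank=None):
    """Return indices of the n_select structures with the largest leverage scores.

    Simplified CUR-style selection: leverage scores are computed from the
    leading eigenvectors of the symmetric kernel matrix K, and the top
    scorers are taken deterministically.
    """
    if rank is None:
        rank = n_select
    eigvals, eigvecs = np.linalg.eigh(K)
    leading = eigvecs[:, np.argsort(eigvals)[::-1][:rank]]
    scores = (leading ** 2).sum(axis=1) / rank
    return np.argsort(scores)[::-1][:n_select]


# Stand-in data: 50 random "structures" with 2 to 8 atoms each and
# 16-component non-negative per-atom descriptors (NOT real SOAP vectors).
rng = np.random.default_rng(0)
structures = [
    np.abs(rng.normal(size=(rng.integers(2, 9), 16))) for _ in range(50)
]

K = kernel_matrix(structures)
picked = select_by_leverage(K, n_select=5)
```

In a loop of the kind described above, the structures indexed by `picked` would be the ones sent for single-point DFT calculations before the potential is refitted. Real descriptors would come from a SOAP implementation rather than random numbers.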