Discovering the Building Blocks of Atomic Systems using Machine Learning

Conrad W. Rosenbrock, Eric R. Homer, Gábor Csányi, Gus L. W. Hart
DOI: https://doi.org/10.1038/s41524-017-0027-x
2017-03-18
Abstract: Machine learning has proven to be a valuable tool to approximate functions in high-dimensional spaces. Unfortunately, analysis of these models to extract the relevant physics is never as easy as applying machine learning to a large dataset in the first place. Here we present a description of atomic systems that generates machine learning representations with a direct path to physical interpretation. As an example, we demonstrate its usefulness as a universal descriptor of grain boundary systems. Grain boundaries in crystalline materials are a quintessential example of a complex, high-dimensional system with broad impact on many physical properties including strength, ductility, corrosion resistance, crack resistance, and conductivity. In addition to modeling such properties, the method also provides insight into the physical "building blocks" that influence them. This opens the way to discover the underlying physics behind behaviors by understanding which building blocks map to particular properties. Once the structures are understood, they can then be optimized for desirable behaviors.
Materials Science, Computational Physics
What problem does this paper attempt to address?
The core problem this paper addresses is: **how to represent and understand the physical properties of atomic systems with machine learning, in particular the complex, high-dimensional structures of grain boundaries (GBs) and their impact on material properties**. Specifically, the authors propose a new descriptor that lets a machine-learning model not only accurately predict properties such as grain boundary energy, trends in temperature-dependent mobility, and shear coupling, but also provide a physical explanation for those predictions. This helps identify the "building blocks" that control these properties, namely local atomic environments (LAEs).

### Main problems and goals of the paper:

1. **Interpretability of high-dimensional data**:
   - Traditional machine-learning methods can fit high-dimensional data well, but the physical mechanisms behind their predictions are often hard to extract.
   - The method proposed in this paper aims to produce machine-learning representations that have a direct physical interpretation, revealing the key atomic structures that affect material properties.
2. **Relationship between grain boundary structures and properties**:
   - Grain boundaries are defects in crystalline materials that strongly affect properties such as strength, ductility, corrosion resistance, crack resistance, and electrical conductivity.
   - The authors use machine learning to relate the local atomic environments of grain boundaries to their physical properties, with the aim of then optimizing those properties.
3. **Simplifying the search in high-dimensional space**:
   - The atomic structure of grain boundaries is complex and very high-dimensional, so directly searching all possible structures is practically impossible.
   - Classifying the local atomic environments of grain boundaries into a limited number of unique structures greatly reduces the search space, making it feasible to look for grain boundaries with specific properties.

### Method overview:

- **Smooth Overlap of Atomic Positions (SOAP) descriptor**: Represents the local environment of a single atom and is invariant to translation, rotation, and permutation.
- **Averaged SOAP Representation (ASR)**: Gives one overall description per grain boundary by averaging the SOAP descriptors of its atoms' local environments.
- **Local Environment Representation (LER)**: Extracts the unique local atomic environments by comparing the local environments of all atoms, and uses them to represent the entire grain boundary.

### Main contributions:

- Proposed two complementary machine-learning representations (ASR and LER), each with its own advantages in prediction performance and physical interpretability.
- Identified the key local atomic environments (LAEs) in grain boundaries, which are highly correlated with the grain boundaries' physical properties.
- Used machine learning to reveal the building blocks of grain boundaries, providing a basis for further optimizing material properties.

In short, the goal of this paper is not only to improve the prediction of grain boundary properties with machine learning, but also to understand the physical mechanisms behind these properties, providing new research tools and ideas for materials science.
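The ASR and LER constructions from the method overview can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: short toy vectors stand in for real SOAP descriptors, plain Euclidean distance with a hypothetical threshold `eps` stands in for the SOAP similarity measure, and the function names `averaged_representation`, `find_unique_environments`, and `local_environment_representation` are invented for illustration.

```python
import numpy as np


def averaged_representation(gb_vectors):
    """ASR-style descriptor: mean of the per-atom vectors of one grain boundary."""
    return np.asarray(gb_vectors, dtype=float).mean(axis=0)


def find_unique_environments(all_gbs, eps):
    """Greedy pass over every atom of every grain boundary.

    A vector is kept as a new unique environment only if it lies farther
    than eps (Euclidean distance, an assumption) from all vectors kept so far.
    """
    unique = []
    for gb in all_gbs:
        for v in np.asarray(gb, dtype=float):
            if all(np.linalg.norm(v - u) > eps for u in unique):
                unique.append(v)
    return np.array(unique)


def local_environment_representation(gb_vectors, unique):
    """LER-style descriptor: fraction of atoms nearest to each unique environment."""
    gb = np.asarray(gb_vectors, dtype=float)
    # Pairwise distances, shape (n_atoms, n_unique).
    dists = np.linalg.norm(gb[:, None, :] - unique[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(unique))
    return counts / counts.sum()


# Toy data: two "grain boundaries" with 2-component stand-in descriptors.
gb_a = np.array([[1.0, 0.0], [1.0, 0.05], [0.0, 1.0]])
gb_b = np.array([[0.0, 1.0], [0.0, 0.95]])

unique = find_unique_environments([gb_a, gb_b], eps=0.2)
asr_a = averaged_representation(gb_a)
ler_a = local_environment_representation(gb_a, unique)
ler_b = local_environment_representation(gb_b, unique)
```

In this toy case the five atoms collapse to two unique environments, so each grain boundary becomes a two-component vector of environment fractions. That fixed-length vector is what would be passed to a standard regressor or classifier, and because each component corresponds to a concrete atomic environment, a feature the model weights heavily can be traced back to a structure.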