Active learning of ternary alloy structures and energies

Gaurav Deshmukh, Noah J. Wichrowski, Nikolaos Evangelou, Pushkar G. Ghanekar, Siddharth Deshpande, Ioannis G. Kevrekidis, Jeffrey Greeley
DOI: https://doi.org/10.1038/s41524-024-01256-z
IF: 12.256
2024-05-31
npj Computational Materials
Abstract: Machine learning models with uncertainty quantification have recently emerged as attractive tools to accelerate the navigation of catalyst design spaces in a data-efficient manner. Here, we combine active learning with a dropout graph convolutional network (dGCN) as a surrogate model to explore the complex materials space of high-entropy alloys (HEAs). We train the dGCN on the formation energies of disordered binary alloy structures in the Pd-Pt-Sn ternary alloy system and improve predictions on ternary structures by performing reduced optimization of the formation free energy, the target property that determines HEA stability, over ensembles of ternary structures constructed based on two coordinate systems: (a) a physics-informed ternary composition space, and (b) data-driven coordinates discovered by the Diffusion Maps manifold learning scheme. Both reduced optimization techniques improve predictions of the formation free energy in the ternary alloy space with a significantly reduced number of DFT calculations compared to a high-fidelity model. The physics-based scheme converges to the target property in a manner akin to a depth-first strategy, whereas the data-driven scheme appears more akin to a breadth-first approach. Both sampling schemes, coupled with our acquisition function, successfully exploit a database of DFT-calculated binary alloy structures and energies, augmented with a relatively small number of ternary alloy calculations, to identify stable ternary HEA compositions and structures. This generalized framework can be extended to incorporate more complex bulk and surface structural motifs, and the results demonstrate that significant dimensionality reduction is possible in thermodynamic sampling problems when suitable active learning schemes are employed.
materials science, multidisciplinary; chemistry, physical
What problem does this paper attempt to address?
The paper addresses the design and optimization of high-entropy alloys (HEAs). Specifically, the researchers aim to efficiently explore ternary alloy systems (here Pd-Pt-Sn) and predict their structures and formation free energies by combining active learning with a dropout graph convolutional network (dGCN), while keeping computational costs low. The main challenge is the highly complex materials space of HEAs: first-principles methods such as density functional theory (DFT) are too costly to apply to the very large number of candidate structures.

To tackle this challenge, the paper proposes an active learning workflow with the following steps:

1. **Initial Model Training**: The dGCN model is trained on DFT data for binary alloys to predict the formation energy and its uncertainty.
2. **Generation of Candidate Structures**: Candidate ternary alloy structures are generated, and the trained dGCN model predicts the formation energy and uncertainty of each one.
3. **Selection of Optimization Coordinates**: Candidate structures are grouped into sets using two different coordinate systems (a physics-informed ternary composition space and data-driven Diffusion Maps coordinates), and the sets most likely to contain stable structures are selected for further DFT calculations.
4. **Iterative Optimization**: DFT calculations are performed on the selected sets, the results are added to the training set, the dGCN model is retrained, and the process is repeated to gradually improve the model's prediction accuracy.

With this method, the researchers identify stable ternary HEA compositions and structures while significantly reducing the number of DFT calculations required. The abstract notes that the framework can be extended to more complex bulk and surface structural motifs.
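The loop described above can be illustrated with a small toy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: a bootstrap ensemble of least-squares fits stands in for the dGCN, an analytic function with noise (`run_dft`) stands in for DFT, candidates are grouped by coarse composition bins as a rough analogue of the physics-informed coordinates, and a lower-confidence-bound score stands in for the paper's acquisition function, whose exact form is not given here.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_energy(x):
    """Toy formation energy (hypothetical form, not from the paper)."""
    a, b, c = x[:, 0], x[:, 1], x[:, 2]
    return -1.0 * a * b - 0.5 * b * c + 0.3 * a * c - 3.0 * a * b * c

def run_dft(x, noise=0.005):
    """Pretend DFT call: the toy energy plus a little noise."""
    return true_energy(x) + noise * rng.standard_normal(len(x))

def features(x):
    """Pairwise and three-body composition products plus a bias column."""
    a, b, c = x[:, 0], x[:, 1], x[:, 2]
    return np.column_stack([a * b, b * c, a * c, a * b * c, np.ones(len(x))])

def fit_ensemble(x, y, n_models=20):
    """Bootstrap ensemble of least-squares fits (stand-in for the dGCN)."""
    weights = []
    for _ in range(n_models):
        idx = rng.integers(0, len(x), len(x))
        w, *_ = np.linalg.lstsq(features(x[idx]), y[idx], rcond=None)
        weights.append(w)
    return np.array(weights)

def predict(weights, x):
    """Ensemble mean and spread, used as prediction and uncertainty."""
    preds = features(x) @ weights.T
    return preds.mean(axis=1), preds.std(axis=1)

def binary_training_set(n_per_edge=20):
    """Compositions on the three binary edges of the ternary simplex."""
    t = np.linspace(0.05, 0.95, n_per_edge)
    z = np.zeros_like(t)
    edges = ((t, 1 - t, z), (z, t, 1 - t), (t, z, 1 - t))
    return np.vstack([np.column_stack(cols) for cols in edges])

def run_active_learning(n_rounds=4, kappa=1.0):
    # Step 1: train on binary data only.
    x_train = binary_training_set()
    y_train = run_dft(x_train)
    weights = fit_ensemble(x_train, y_train)

    # Step 2: generate ternary candidates and group them by composition bin.
    cand = rng.dirichlet(np.ones(3), size=200)
    groups = (np.floor(cand[:, 0] * 4) * 4 + np.floor(cand[:, 1] * 4)).astype(int)
    remaining = set(np.unique(groups).tolist())

    # The toy oracle is cheap, so the error can be tracked on all candidates.
    mae = [np.abs(predict(weights, cand)[0] - true_energy(cand)).mean()]
    for _ in range(n_rounds):
        # Step 3: pick the unsampled group with the lowest confidence bound.
        mean, std = predict(weights, cand)
        lcb = mean - kappa * std
        best = min(remaining, key=lambda g: lcb[groups == g].min())
        remaining.remove(best)

        # Step 4: "run DFT" on that group, extend the training set, retrain.
        picked = cand[groups == best]
        x_train = np.vstack([x_train, picked])
        y_train = np.concatenate([y_train, run_dft(picked)])
        weights = fit_ensemble(x_train, y_train)
        mae.append(np.abs(predict(weights, cand)[0] - true_energy(cand)).mean())
    return mae, len(x_train)

mae_history, n_train = run_active_learning()
print("MAE on ternary candidates per round:", np.round(mae_history, 4))
print("Total toy DFT calls:", n_train)
```

In this toy, the binary data cannot constrain the three-body term, so the first ternary group sampled accounts for most of the error reduction. That mirrors, in a very simplified way, the idea of augmenting a binary database with a small number of ternary calculations.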