Data-efficient and Interpretable Inverse Materials Design using a Disentangled Variational Autoencoder

Cheng Zeng, Zulqarnain Khan, Nathan L. Post
2024-09-10
Abstract: Inverse materials design has proven successful in accelerating novel material discovery. Many inverse materials design methods use unsupervised learning, where a latent space is learned to offer a compact description of materials representations. A latent space learned this way is likely to be entangled with respect to the target property and other properties of the materials, which makes the inverse design process ambiguous. Here, we present a semi-supervised learning approach based on a disentangled variational autoencoder to learn a probabilistic relationship between features, latent variables and target properties. This approach is data efficient because it combines all labelled and unlabelled data in a coherent manner, and it uses expert-informed prior distributions to improve model robustness even with limited labelled data. It is in essence interpretable, as the learnable target property is disentangled from the other properties of the materials, and an extra layer of interpretability can be provided by a post-hoc analysis of the classification head of the model. We demonstrate this new approach on an experimental high-entropy alloy dataset with chemical compositions as input and single-phase formation as the single target property. While a single property is used in this work, the disentangled model can be extended and customized for inverse design of materials with multiple target properties.
Machine Learning, Materials Science
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses several key issues in the inverse design of materials:

1. **Data Efficiency**: The paper proposes a semi-supervised learning method that improves data efficiency by combining labeled and unlabeled data. The method uses expert-informed prior distributions to constrain model fitting and exhibits lower variance when predicting on unseen data.
2. **Interpretability**: A Disentangled Variational Autoencoder (DVAE) learns the probabilistic relationships between material features, latent variables, and target properties. The target property is decoupled from the other material properties, which makes the model more interpretable.
3. **Multi-attribute Optimization**: Although the paper demonstrates the method on a single target property, single-phase formation in High-Entropy Alloys (HEAs), the method can be extended to material discovery tasks with multiple target properties and so to more complex design requirements.

Specifically, the goal of the paper is to develop a workflow for efficient and interpretable inverse materials design, with a focus on designing HEAs that tend to form single-phase structures. Traditional experimental design, thermodynamic modeling, and first-principles calculations are inefficient for searching the HEA space because the number of possible element combinations grows combinatorially as more elements are considered. The paper therefore proposes a DVAE-based method for the inverse design of complex materials and demonstrates it on single-phase HEAs.
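
To make the semi-supervised idea above more concrete, here is a minimal sketch of how labeled and unlabeled samples could contribute to one training objective. It follows the generic semi-supervised VAE recipe, not necessarily the paper's exact objective. All function names, the `alpha` weight, and the scalar `prior` (standing in for an expert-informed prior on single-phase formation) are assumptions made for illustration, and the reconstruction errors are passed in as numbers rather than computed by a real decoder.

```python
import numpy as np

EPS = 1e-12  # guards against log(0)


def gaussian_kl(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)


def bernoulli_nll(q_y, y):
    """Negative log-likelihood of a binary label y under predicted probability q_y."""
    return -(y * np.log(q_y + EPS) + (1 - y) * np.log(1 - q_y + EPS))


def bernoulli_kl(q_y, prior):
    """KL(Bernoulli(q_y) || Bernoulli(prior))."""
    return (q_y * np.log((q_y + EPS) / (prior + EPS))
            + (1 - q_y) * np.log((1 - q_y + EPS) / (1 - prior + EPS)))


def labelled_loss(recon_err, mu, logvar, q_y, y, prior, alpha=1.0):
    """Loss for a sample with a known label y (1 = single phase, 0 = not).

    recon_err: reconstruction error of the composition given (z, y)
    q_y:       classifier head's probability that y = 1
    prior:     assumed prior probability that y = 1
    alpha:     hypothetical weight on the supervised classification term
    """
    log_prior = y * np.log(prior + EPS) + (1 - y) * np.log(1 - prior + EPS)
    return recon_err + gaussian_kl(mu, logvar) - log_prior + alpha * bernoulli_nll(q_y, y)


def unlabelled_loss(recon_err_y0, recon_err_y1, mu, logvar, q_y, prior):
    """Loss for a sample without a label: marginalise over both label values.

    Assumes the latent code z is encoded from the input alone, so the same
    (mu, logvar) is used for y = 0 and y = 1.
    """
    kl_z = gaussian_kl(mu, logvar)
    expected = (1 - q_y) * (recon_err_y0 + kl_z) + q_y * (recon_err_y1 + kl_z)
    return expected + bernoulli_kl(q_y, prior)


# Toy usage with made-up numbers (not data from the paper).
mu, logvar = np.zeros(4), np.zeros(4)
print(labelled_loss(1.0, mu, logvar, q_y=0.5, y=1, prior=0.5))
print(unlabelled_loss(2.0, 4.0, mu, logvar, q_y=0.3, prior=0.3))
```

In this sketch the label is handled separately from the continuous latent code `z`, which is one simple way to keep the target property apart from the remaining material properties. The prior term pulls the classifier toward expert expectations when labeled data are scarce.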