Data-driven discovery of 2D materials by deep generative models

Peder Lyngby, Kristian Sommer Thygesen
DOI: https://doi.org/10.1038/s41524-022-00923-3
IF: 12.256
2022-11-12
npj Computational Materials
Abstract: Efficient algorithms to generate candidate crystal structures with good stability properties can play a key role in data-driven materials discovery. Here, we show that a crystal diffusion variational autoencoder (CDVAE) is capable of generating two-dimensional (2D) materials of high chemical and structural diversity and formation energies mirroring the training structures. Specifically, we train the CDVAE on 2615 2D materials with energy above the convex hull ΔH<sub>hull</sub> < 0.3 eV/atom, and generate 5003 materials that we relax using density functional theory (DFT). We also generate 14192 new crystals by systematic element substitution of the training structures. We find that the generative model and lattice decoration approach are complementary and yield materials with similar stability properties but very different crystal structures and chemical compositions. In total we find 11630 predicted new 2D materials, where 8599 of these have ΔH<sub>hull</sub> < 0.3 eV/atom as the seed structures, while 2004 are within 50 meV of the convex hull and could potentially be synthesised. The relaxed atomic structures of all the materials are available in the open Computational 2D Materials Database (C2DB). Our work establishes the CDVAE as an efficient and reliable crystal generation machine, and significantly expands the space of 2D materials.
Materials Science, Multidisciplinary; Chemistry, Physical
What problem does this paper attempt to address?
This paper addresses how to efficiently generate stable two-dimensional (2D) materials with data-driven methods. Specifically, the researchers use a deep generative model, the crystal diffusion variational autoencoder (CDVAE), to generate chemically and structurally diverse 2D materials whose formation energies mirror those of the training set. They also generate new crystal structures by systematic element substitution.

### Main Problems

1. **Efficient Generation of Candidate Crystal Structures**: Materials generated by traditional lattice-decoration methods tend to resemble their reference materials, which limits the diversity of the new materials. A method that can generate more diverse candidates is therefore needed.
2. **Stability Evaluation**: Newly generated materials need good stability, that is, a low energy above the convex hull (ΔH<sub>hull</sub>). This criterion helps identify materials that could actually be synthesized.
3. **Expansion of 2D Material Space**: Generating a large number of new structures significantly expands the known space of 2D materials and opens up more possibilities for materials discovery.

### Solutions

- **CDVAE Model**: A CDVAE trained on known 2D materials generates new 2D materials with high chemical and structural diversity and with formation energies similar to those of the training set.
- **Lattice Decoration Protocol (LDP)**: New crystal structures are generated by systematically substituting elements in the training-set materials.

The two methods are complementary: they yield materials with similar stability properties but very different crystal structures and chemical compositions.
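
The element-substitution idea behind the LDP can be illustrated with a minimal sketch. It is not the authors' implementation: the `SIMILAR` table, the `decorate` function, and the MoS2-like seed are hypothetical, and "chemically similar" is approximated here by a small hand-written table of same-group elements, whereas the actual protocol may use a different similarity measure. Every occurrence of an element is replaced consistently, so the lattice sites stay fixed and only their decoration changes.

```python
from itertools import product

# Hypothetical similarity table (assumption): each element maps to itself
# plus a few same-group neighbours that may replace it.
SIMILAR = {
    "Mo": ["Mo", "W"],
    "S": ["S", "Se", "Te"],
}


def decorate(species, similar=SIMILAR):
    """Return all element-substituted variants of a list of site species.

    Each distinct element in the seed is swapped consistently on every site
    it occupies. The unchanged seed itself is excluded from the result.
    """
    unique = sorted(set(species))
    # Elements missing from the table are left unchanged.
    options = [similar.get(el, [el]) for el in unique]
    variants = []
    for choice in product(*options):
        # Skip substitutions that would merge two distinct elements into one.
        if len(set(choice)) < len(unique):
            continue
        mapping = dict(zip(unique, choice))
        candidate = tuple(mapping[el] for el in species)
        if candidate != tuple(species):
            variants.append(candidate)
    return variants


seed = ["Mo", "S", "S"]  # site species of a hypothetical MoS2-like seed
variants = decorate(seed)
for v in variants:
    print(v)
```

In a real workflow each variant would still need a DFT relaxation before its stability could be assessed; the sketch only enumerates candidate compositions.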

### Experimental Results

- 5,003 new materials were generated with the CDVAE and relaxed using density functional theory (DFT).
- 14,192 new materials were generated with the LDP.
- In total, 11,630 predicted new 2D materials were found. Of these, 8,599 have an energy above the convex hull of ΔH<sub>hull</sub> < 0.3 eV/atom, the same criterion the seed structures satisfy, and 2,004 lie within 50 meV of the convex hull and could potentially be synthesized.

### Conclusion

The study generates a large number of new 2D materials that combine good stability with high diversity. In particular, the CDVAE learns the stability characteristics of the training-set materials and can generate classes of materials that are absent from the training set. This provides new tools for future autonomous materials discovery.
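
The two stability thresholds quoted in the results can be expressed as a simple screening step. The sketch below is illustrative only: the function name `screen`, the strict `<` comparisons, the reading of "within 50 meV" as a per-atom value, and the sample energies are all assumptions, and the numbers are made up rather than taken from the paper or from C2DB.

```python
SEED_CUTOFF = 0.3        # eV/atom, same criterion as the training (seed) set
NEAR_HULL_CUTOFF = 0.05  # eV/atom, i.e. 50 meV above the convex hull (assumed per atom)


def screen(e_hull_values):
    """Count materials passing each energy-above-hull threshold.

    The counts are cumulative: every material within 50 meV of the hull
    is also counted among those below the 0.3 eV/atom cutoff.
    """
    below_seed = [e for e in e_hull_values if e < SEED_CUTOFF]
    near_hull = [e for e in below_seed if e < NEAR_HULL_CUTOFF]
    return {
        "total": len(e_hull_values),
        "below_seed_cutoff": len(below_seed),
        "near_hull": len(near_hull),
    }


# Made-up energies above the hull (eV/atom), for illustration only.
sample = [0.01, 0.04, 0.12, 0.29, 0.35, 0.80]
result = screen(sample)
print(result)
```

Applied to the relaxed structures reported in the paper, a filter of this kind would correspond to the counts of 8,599 (below 0.3 eV/atom) and 2,004 (within 50 meV of the hull) out of 11,630 new materials.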