Crystal Structure Generation Based On Material Properties

Chao Huang, JiaHui Chen, HongRui Liang, ChunYan Chen, Chen Chen
2024-11-13
Abstract: The discovery of new materials is very important to the field of materials science. When researchers explore new materials, they often have expected performance requirements for the crystal structure. In recent years, data-driven methods have made great progress in crystal structure generation, but there is still a lack of methods that can effectively map material properties to crystal structures. In this paper, we propose a Crystal DiT model that generates a crystal structure from the expected material properties by embedding those properties and combining them with symmetry information predicted by a large language model. Experimental verification shows that our proposed method has good performance.
Artificial Intelligence, Materials Science
What problem does this paper attempt to address?
The problem that this paper attempts to solve is how to generate crystal structures from material properties. Current data-driven methods have made remarkable progress in crystal structure generation, but they lack an effective way to map material properties to crystal structures. The paper proposes a model named Crystal DiT, which generates crystal structures from expected material properties by embedding those properties and combining them with symmetry information predicted by a large language model. Experimental verification shows that this method has good performance.

### Main Contributions

1. **Process Division**: The crystal structure generation process is divided into two parts:
   - Part One: Generate space-group information from the required material properties, which is done by the GLM4 model.
   - Part Two: Generate crystal structures from the material properties and space-group information, which is done by the DiT model.
2. **Fine-Tuning of the GLM4 Model**: The GLM4 model is fine-tuned through prompt engineering so that it outputs reasonable crystal space groups and Wyckoff positions for the input element types and material properties.
3. **Design of the DiT Model**: The paper proposes a DiT model with symmetry-information constraints. By introducing a material-property embedding and a crystal-graph-structure Transformer, the model can generate the expected crystal structures under the constraints of material properties and space groups.
4. **Experimental Verification**: All models are trained and adapted on the NVIDIA platform and the Ascend Atlas 800T A2 platform. Experimental results show that the proposed method can generate stable crystal structures that meet the expected performance requirements under the constraints of material properties and space groups.

### Related Work

- **Diffusion Models**: Diffusion models are powerful generative models.
They generate data by simulating the gradual addition of noise and then learning to reverse that process. Several diffusion models mentioned in the paper, such as DDPM and NCSN, have achieved remarkable results in tasks such as image generation, super-resolution, and image restoration.
- **Large Language Models**: Traditional language models such as T5 and BERT have shown some promise in crystallography. However, these methods are currently limited to one-way prediction of the properties of known crystal structures. They cannot yet combine crystallographic prior knowledge with the capabilities of generative models to inversely generate reasonable and diverse crystal structures.
- **Data-Driven Crystal Generation Models**: In recent years, researchers have developed a variety of generative models based on different data representations, including atomic species on lattice sites, voxel representations, and distance matrices. Several models mentioned in the paper, such as CDVAE, DiffCSP, and PGCGM, perform well on the crystal structure generation task.

### Method Introduction

- **Large Language Models for Crystal Symmetry Information Prediction**: The open-source GLM4 large language model is used. A dataset is created through prompt engineering and the model is fine-tuned on it, so that it learns the correspondence between atomic properties, material properties, and crystal symmetry information.
- **Crystal Structure Diffusion Models and Transformers**: A crystal structure DiT model is designed to receive the symmetry information output by the large language model and to generate the expected crystal structures from the initially input atomic and material properties. The model mainly builds on DiffCSP++, adding a crystal multi-head attention module that fuses crystal features with the expected material properties.
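
The diffusion models cited under Related Work, DDPM in particular, corrupt data with noise step by step and train a network to undo that corruption. As a rough illustration, the sketch below implements only the closed-form DDPM forward (noising) step, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, on a toy vector. The linear schedule (1e-4 to 0.02 over 1000 steps) is a commonly used DDPM default and is an assumption here, as are all function names. This is a generic sketch and not the noising procedure of Crystal DiT, which the summary does not specify.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return per-step noise variances on a linear schedule (assumed values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) up to step t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form; also return the noise."""
    signal = math.sqrt(alpha_bars[t])        # weight kept on the clean data
    sigma = math.sqrt(1.0 - alpha_bars[t])   # weight placed on Gaussian noise
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + sigma * e for x, e in zip(x0, eps)]
    return xt, eps


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

x0 = [0.25, 0.50, 0.75]  # toy data vector, purely illustrative
xt, eps = add_noise(x0, t=999, alpha_bars=alpha_bars, rng=rng)
print("remaining signal weight at the last step:", math.sqrt(alpha_bars[-1]))
```

A denoising network would be trained to predict `eps` from `xt` and `t`. Generation then starts from pure noise and applies the learned reverse steps, optionally conditioned on extra inputs such as property embeddings.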
### Experimental Results

- **Crystal Symmetry Information Prediction**: The model is fine-tuned on a dataset of 152,823 materials from the Materials Project database, and the accuracy of the data it generates is evaluated with the ROUGE-1, ROUGE-2, and ROUGE-L metrics.
- **Crystal Structure Generation**: For the test dataset, the SpacegroupAnalyzer class and the pyxtal tool are used to obtain the space group and Wyckoff positions of each material, along with the corresponding material properties. These data are input into the Crystal DiT model to generate crystal structures, and the match rate and RMSD are computed with the StructureMatcher class. Experimental results show that the proposed method performs excellently on the crystal structure generation task.

### Conclusion

This paper proposes a crystal-generation method Uni-MDM based on large language models and DiT, which effectively integrates...
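
The ROUGE metrics used in the symmetry-prediction experiment score a generated text by its overlap with a reference text: ROUGE-1 and ROUGE-2 count shared unigrams and bigrams, and ROUGE-L uses the longest common subsequence. The sketch below is a minimal pure-Python version that assumes whitespace tokenization and reports F1. Published ROUGE implementations apply their own tokenization and options, so their scores can differ. The two example strings are hypothetical and do not reproduce the paper's actual output format.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Combine overlap-based precision and recall into an F1 score."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1 from clipped n-gram overlap (whitespace tokens assumed)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    cand, ref = candidate.split(), reference.split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


# Hypothetical reference and model output, for illustration only.
reference = "space group 225 wyckoff 4a 4b"
candidate = "space group 225 wyckoff 4a 8c"

print("ROUGE-1:", rouge_n(candidate, reference, 1))
print("ROUGE-2:", rouge_n(candidate, reference, 2))
print("ROUGE-L:", rouge_l(candidate, reference))
```

Because these scores reward token overlap, a prediction with one wrong Wyckoff label can still score high. This is one reason the structure-level match rate and RMSD are reported separately.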