DiffractGPT: Atomic Structure Determination from X-ray Diffraction Patterns using Generative Pre-trained Transformer

Kamal Choudhary
DOI: https://doi.org/10.26434/chemrxiv-2024-ztp85
2024-11-21
Abstract: Crystal structure determination from powder diffraction patterns is a complex challenge in materials science, often requiring extensive expertise and computational resources. This study introduces DiffractGPT, a generative pre-trained transformer model designed to predict atomic structures directly from X-ray diffraction (XRD) patterns. By capturing the intricate relationships between diffraction patterns and crystal structures, DiffractGPT enables fast and accurate inverse design. The model is trained on thousands of atomic structures and their simulated XRD patterns from the JARVIS-DFT dataset, and we evaluate it across three scenarios: (1) without chemical information, (2) with a list of elements, and (3) with an explicit chemical formula. The results demonstrate that incorporating chemical information significantly enhances prediction accuracy. Additionally, the training process is straightforward and fast, bridging gaps between computational, data science, and experimental communities. This work represents a significant advancement in automating crystal structure determination, offering a robust tool for data-driven materials discovery and design.
Chemistry
What problem does this paper attempt to address?
This paper addresses the challenge of determining crystal structures from powder diffraction patterns in materials science. The current process usually requires extensive expertise and computational resources and involves a great deal of trial and error. The paper introduces DiffractGPT, a generative pre-trained transformer model that aims to predict atomic structures directly from X-ray diffraction (XRD) patterns. By capturing the complex relationships between diffraction patterns and crystal structures, DiffractGPT enables rapid and accurate inverse design. The study evaluates the model in three scenarios: (1) without chemical information, (2) with a list of elements provided, and (3) with an explicit chemical formula provided. The results show that adding chemical information significantly improves prediction accuracy. In addition, the training process is simple and fast, which helps bridge the gap between the computational, data science, and experimental communities. The work represents a significant advance in automated crystal structure determination and provides a powerful tool for data-driven materials discovery and design.
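
The three evaluation scenarios differ only in how much chemical information accompanies the diffraction pattern. The Python sketch below illustrates that idea by assembling a text prompt for each case. It is a sketch under stated assumptions, not the paper's implementation: the function name `build_prompt`, the prompt wording, the peak normalisation, and the two-decimal formatting are all hypothetical, since the text above does not specify how DiffractGPT encodes its inputs.

```python
from typing import Optional, Sequence


def build_prompt(
    intensities: Sequence[float],
    elements: Optional[Sequence[str]] = None,
    formula: Optional[str] = None,
) -> str:
    """Assemble a hypothetical text prompt for one of the three scenarios.

    - no hint given          -> scenario 1 (no chemical information)
    - `elements` given       -> scenario 2 (list of elements)
    - `formula` given        -> scenario 3 (explicit chemical formula)
    """
    if elements is not None and formula is not None:
        raise ValueError("Provide either elements or formula, not both.")
    if len(intensities) == 0:
        raise ValueError("Pattern must not be empty.")

    # Assumed preprocessing: scale so the strongest peak equals 1.0.
    peak = max(intensities)
    if peak <= 0:
        raise ValueError("Pattern must contain a positive intensity.")
    pattern = " ".join(f"{x / peak:.2f}" for x in intensities)

    # Assumed wording for the optional chemical hint.
    if formula is not None:
        hint = f"The chemical formula is {formula}. "
    elif elements is not None:
        hint = f"The chemical elements are {', '.join(elements)}. "
    else:
        hint = ""

    return f"{hint}The XRD pattern is {pattern}. Generate the atomic structure."


# Toy three-point pattern; a real pattern would have many more points.
toy_pattern = [0.0, 5.0, 10.0]
prompt_no_chem = build_prompt(toy_pattern)
prompt_elements = build_prompt(toy_pattern, elements=["Si", "O"])
prompt_formula = build_prompt(toy_pattern, formula="SiO2")
```

Keeping the pattern encoding identical across the three prompts means any difference in prediction accuracy can be attributed to the chemical hint alone, which is the comparison the study reports.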