Predictive crystallography at scale: mapping, validating, and learning from 1,000 crystal energy landscapes

Christopher Taylor, Patrick Butler, Graeme Matthew Day
DOI: https://doi.org/10.1039/d4fd00105b
2024-06-05
Faraday Discussions
Abstract: Computational crystal structure prediction (CSP) is an increasingly powerful technique in materials discovery, due to its ability to reveal trends and permit insight across the possibility space of crystal structures of a candidate molecule, beyond simply the observed structure(s). In this work, we demonstrate the reliability and scalability of CSP methods for small, rigid organic molecules by performing in-depth CSP investigations for over 1000 such compounds, the largest survey of its kind to date. We show that this highly efficient force-field-based CSP approach is superbly predictive, locating 99.4% of observed experimental structures, and ranking a large majority of these (74%) as among the most stable possible structures (to within uncertainty due to thermal effects). We present two examples of insights such large predicted datasets can permit, examining the space group preferences of organic molecular crystals and rationalising empirical rules concerning the spontaneous resolution of chiral molecules. Finally, we exploit this large and diverse dataset for developing transferable machine-learned energy potentials for the organic solid state, training a neural network lattice energy correction to force field energies that offers substantial improvements to the already impressive energy rankings, and a MACE equivariant message-passing neural network for crystal structure reoptimisation. We conclude that the excellent performance and reliability of the CSP workflow enables the creation of very large datasets of broad utility and explanatory power in materials design.
chemistry, physical
What problem does this paper attempt to address?
The paper addresses whether computational crystal structure prediction (CSP) can be applied reliably and at large scale to small, rigid organic molecules. The authors ran in-depth CSP investigations for over 1,000 such molecules, the largest survey of its kind to date. Their efficient force-field-based workflow located 99.4% of the experimentally observed structures and ranked a large majority of those (74%) among the most stable predicted structures, to within the uncertainty due to thermal effects. The paper also illustrates two insights that such large predicted datasets permit: an analysis of the space group preferences of organic molecular crystals, and a rationalization of empirical rules for the spontaneous resolution of chiral molecules. Finally, the authors use the dataset to develop transferable machine-learned potentials for the organic solid state: a neural network lattice energy correction to the force field energies, which improves the energy rankings, and a MACE equivariant message-passing neural network for crystal structure reoptimization. They conclude that the performance and reliability of the CSP workflow enable the creation of very large datasets with broad utility and explanatory power in materials design.
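To make the two headline statistics concrete, the sketch below shows one way a "located" fraction and a "ranked among the most stable" fraction could be computed from a set of predicted energy landscapes. It is only an illustration: the `Landscape` class, the `summarise` function, the toy energies, and the 2.0 kJ/mol window are all hypothetical and are not taken from the paper, which does not state its energy threshold in the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Landscape:
    """One molecule's predicted crystal energy landscape (hypothetical container)."""
    energies: List[float]        # lattice energies of predicted structures, kJ/mol
    match_index: Optional[int]   # index of the prediction matching experiment, or None


def summarise(landscapes: List[Landscape], window_kj_mol: float) -> Tuple[float, float]:
    """Return (fraction located, fraction of located within the energy window)."""
    located = [l for l in landscapes if l.match_index is not None]
    if not landscapes or not located:
        return 0.0, 0.0
    # A located structure counts as low-energy if it lies within the window
    # above the global minimum of its own landscape.
    low = [
        l for l in located
        if l.energies[l.match_index] - min(l.energies) <= window_kj_mol
    ]
    return len(located) / len(landscapes), len(low) / len(located)


# Toy data, invented for illustration only.
toy = [
    Landscape([-100.0, -98.5, -95.0], 0),   # match is the global minimum
    Landscape([-80.0, -79.0, -70.0], 2),    # match lies 10 kJ/mol above the minimum
    Landscape([-60.0, -59.0], None),        # experimental structure not located
    Landscape([-120.0, -119.0], 1),         # match lies 1 kJ/mol above the minimum
]

# The 2.0 kJ/mol window is an assumed value, not the paper's.
located_frac, low_frac = summarise(toy, window_kj_mol=2.0)
print(f"located: {located_frac:.2f}, low-energy among located: {low_frac:.2f}")
```

The second fraction is taken relative to the located structures, mirroring the abstract's wording that 74% "of these" are ranked among the most stable.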