CSPBench: a benchmark and critical evaluation of Crystal Structure Prediction

Lai Wei, Sadman Sadeed Omee, Rongzhi Dong, Nihang Fu, Yuqi Song, Edirisuriya M. D. Siriwardane, Meiling Xu, Chris Wolverton, Jianjun Hu
2024-06-30
Abstract: Crystal structure prediction (CSP) is increasingly used to discover novel materials with applications in diverse industries. However, despite decades of development and significant progress in this area, the field lacks well-defined benchmark datasets, quantitative performance metrics, and studies that evaluate its status. We aim to fill this gap by introducing a CSP benchmark suite with 180 test structures along with our recently implemented CSP performance metric set. We benchmark a collection of 13 state-of-the-art (SOTA) CSP algorithms, including template-based CSP algorithms, conventional CSP algorithms based on DFT calculations and global search such as CALYPSO, CSP algorithms based on machine learning (ML) potentials and global search, and distance matrix based CSP algorithms. Our results demonstrate that the performance of current CSP algorithms is far from satisfactory. Most algorithms cannot even identify structures with the correct space groups; the exception is the template-based algorithms when applied to test structures with similar templates. We also find that the ML potential based CSP algorithms now achieve performance competitive with the DFT-based algorithms. The performance of these CSP algorithms is strongly determined by the quality of the neural potentials as well as the global optimization algorithms. Our benchmark suite comes with a comprehensive open-source codebase and 180 well-selected benchmark crystal structures, making it convenient to evaluate the advantages and disadvantages of CSP algorithms from future studies. All the code and benchmark data are available at https://github.com/usccolumbia/cspbenchmark
Materials Science
What problem does this paper attempt to address?
This paper addresses the lack of standardized benchmark datasets, quantitative performance metrics, and status assessments in the field of crystal structure prediction (CSP). Despite decades of development and significant progress in CSP methods, there is still no well-defined benchmark dataset for measuring the performance of different algorithms. The authors propose a CSP benchmark suite of 180 test structures together with their recently implemented CSP performance metric set. They evaluate 13 state-of-the-art CSP algorithms, including template-based methods, conventional algorithms based on density functional theory (DFT) calculations and global search such as CALYPSO, algorithms based on machine learning (ML) potentials and global search, and distance matrix-based methods. The results show that the performance of current CSP algorithms is far from satisfactory: most algorithms fail to identify structures with the correct space group, the exception being template-based algorithms applied to test structures that have similar templates. However, ML potential-based CSP algorithms now achieve performance competitive with DFT-based algorithms. The performance of these algorithms depends strongly on the quality of the neural potentials and of the global optimization algorithms. The benchmark suite provides a comprehensive open-source codebase and 180 selected benchmark crystal structures to facilitate the evaluation of the strengths and weaknesses of CSP algorithms in future research. Through this benchmark, the authors hope to promote progress in the CSP field and the discovery of new materials.
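The space-group criterion mentioned above can be illustrated with a minimal Python sketch. It is not the paper's metric set or code from the cspbenchmark repository: the function name `space_group_match_rate`, the record layout, and the toy records are all hypothetical, and space groups are assumed to be given as international numbers (1 to 230). The sketch only counts, per algorithm, how often the predicted space group equals the ground-truth one.

```python
from collections import defaultdict


def space_group_match_rate(records):
    """Return the fraction of space-group matches per algorithm.

    `records` is an iterable of (algorithm, predicted_sg, true_sg) tuples,
    where the space groups are international numbers. This layout is an
    assumption made for illustration, not the benchmark's data format.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for algorithm, predicted_sg, true_sg in records:
        totals[algorithm] += 1
        if predicted_sg == true_sg:
            hits[algorithm] += 1
    # Every algorithm in `totals` has at least one record, so no division by zero.
    return {name: hits[name] / totals[name] for name in totals}


# Made-up toy records for illustration only; these are NOT results from the paper.
toy_records = [
    ("algo_a", 225, 225),  # match
    ("algo_a", 62, 221),   # mismatch
    ("algo_b", 225, 225),  # match
    ("algo_b", 221, 221),  # match
]

rates = space_group_match_rate(toy_records)
for name, rate in sorted(rates.items()):
    print(f"{name}: {rate:.2f}")
```

A match on space group alone is a coarse check: two structures can share a space group and still differ in lattice parameters or atomic positions, which is one reason a benchmark would pair it with additional quantitative metrics.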