Towards Quantitative Evaluation of Crystal Structure Prediction Performance

Lai Wei, Qin Li, Sadman Sadeed Omee, Jianjun Hu
2023-07-12
Abstract: Crystal structure prediction (CSP) is now increasingly used in the discovery of novel materials with applications in diverse industries. However, despite decades of developments, the problem is far from being solved. With the progress of deep learning, search algorithms, and surrogate energy models, there is a great opportunity for breakthroughs in this area. However, the evaluation of CSP algorithms primarily relies on manual structural and formation energy comparisons. The lack of a set of well-defined quantitative performance metrics for CSP algorithms makes it difficult to evaluate the status of the field and identify the strengths and weaknesses of different CSP algorithms. Here, we analyze the quality evaluation issue in CSP and propose a set of quantitative structure similarity metrics, which when combined can be used to automatically determine the quality of the predicted crystal structures compared to the ground truths. Our CSP performance metrics can then be utilized to evaluate the large set of existing and emerging CSP algorithms, thereby alleviating the burden of manual inspection on a case-by-case basis. The related open-source code can be accessed freely at https://github.com/usccolumbia/CSPBenchMetrics
Materials Science
### What problem does this paper attempt to address?

This paper addresses the **quantitative evaluation of crystal structure prediction (CSP) algorithms**. CSP is of great significance in materials science, and it has progressed markedly in recent years thanks to advances in deep learning, search algorithms, and surrogate energy models. Even so, the field still lacks a complete set of **quantitative performance evaluation metrics** for objectively comparing different CSP algorithms.

#### Main problems include:

1. **Limitations of manual evaluation**: The evaluation of most CSP algorithms currently depends on manual comparison of structures and formation energies, which is time-consuming, labor-intensive, and difficult to standardize.
2. **Lack of unified standards**: There is no clear set of quantitative metrics for measuring the similarity between a predicted crystal structure and the real structure, which makes it difficult to systematically evaluate and compare the strengths and weaknesses of different algorithms.
3. **Subjectivity of evaluation methods**: Different researchers verify predicted structures in different ways, which introduces subjectivity and arbitrariness and makes results difficult to reproduce and compare.

#### Solutions proposed in the paper:

To meet these challenges, the paper proposes a set of **quantitative structure similarity metrics** that automatically measure how far a predicted crystal structure is from the real structure. Combining multiple metrics captures the similarity between structures more comprehensively and provides an objective, systematic evaluation framework for CSP algorithms.
#### Specific contributions:

- **Introducing multiple distance metrics**: such as energy distance (ED), Wyckoff position fractional coordinate distance (WD), adjacency matrix distance (AMD), Pymatgen RMS distance (PRD), Sinkhorn distance (SD), Chamfer distance (CD), Hausdorff distance (HD), Superpose distance (SPD), graph edit distance (GED), X-ray diffraction pattern distance (XD), and orbital field matrix distance (OD).
- **Evaluating the effectiveness of measurement metrics**: The correlation between these measurement metrics and structural perturbations has been verified through experiments to ensure that they can accurately reflect the closeness between the predicted structure and the real structure.
- **Visualizing search trajectories**: These measurement metrics are used to perform visual analysis on the search processes of different CSP algorithms, revealing the behavioral characteristics of each algorithm during the optimization process.

In short, this paper is committed to establishing a standardized and objective evaluation system to promote the further development of the CSP field and provide reliable tools for researchers to evaluate and improve existing CSP algorithms.
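
To make two of the listed metrics concrete, the following is a minimal sketch of the Chamfer distance (CD) and Hausdorff distance (HD) between the atomic positions of a predicted and a ground-truth structure. It uses the textbook point-set definitions and is not taken from the paper's CSPBenchMetrics code, whose exact normalization may differ. The helper names and the four-atom coordinates are hypothetical, and the sketch assumes Cartesian coordinates that are already aligned, ignoring periodic boundary conditions and atom types.

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distance between every row of a (n, 3) and every row of b (m, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def chamfer_distance(a, b):
    """Mean nearest-neighbor distance from a to b plus the same from b to a."""
    d = pairwise_distances(a, b)
    return d.min(axis=1).mean() + d.min(axis=0).mean()


def hausdorff_distance(a, b):
    """Largest nearest-neighbor distance, taken over both directions."""
    d = pairwise_distances(a, b)
    return max(d.min(axis=1).max(), d.min(axis=0).max())


# Hypothetical ground-truth positions (angstroms) for four atoms.
truth = np.array(
    [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
)

# Perturb a single atom by increasing amounts along x, mimicking
# predictions that are progressively further from the ground truth.
for shift in (0.1, 0.3, 0.5):
    predicted = truth.copy()
    predicted[1, 0] += shift
    cd = chamfer_distance(truth, predicted)
    hd = hausdorff_distance(truth, predicted)
    print(f"shift={shift:.1f}  chamfer={cd:.3f}  hausdorff={hd:.3f}")
```

In this toy setup both distances are zero for identical structures and grow with the size of the perturbation, which is the kind of behavior the perturbation experiments described above are meant to check. The Chamfer distance averages over all atoms, so a single displaced atom moves it less than it moves the Hausdorff distance, which tracks the worst-matched atom.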