Enhancing the Quality and Reliability of Machine Learning Interatomic Potentials through Better Reporting Practices

Tristan Maxson, Ademola Soyemi, Benjamin W. J. Chen, Tibor Szilvási
2024-01-04
Abstract: Recent developments in machine learning interatomic potentials (MLIPs) have empowered even non-experts in machine learning to train MLIPs for accelerating materials simulations. However, the current literature lacks clear standards for documenting the use of MLIPs, which hinders the reproducibility and independent evaluation of the presented results. In this perspective, we aim to provide guidance on best practices for documenting MLIP use while walking the reader through the development and deployment of MLIPs including hardware and software requirements, generating training data, training models, validating predictions, and MLIP inference. We also suggest useful plotting practices and analyses to validate and boost confidence in the deployed models. Finally, we provide a step-by-step checklist for practitioners to use directly before publication to standardize the information to be reported. Overall, we hope that our work will encourage reliable and reproducible use of these MLIPs, which will accelerate their ability to make a positive impact in various disciplines including materials science, chemistry, and biology, among others.
Chemical Physics, Materials Science
What problem does this paper attempt to address?
This paper addresses the lack of clear standards for documenting the use of machine learning interatomic potentials (MLIPs) in materials simulation. MLIPs now allow even non-experts in machine learning to train models that accelerate materials simulations, but the field has developed so quickly that reporting conventions have not yet been established. The resulting inconsistency hinders reproducibility and independent evaluation of published results, and it makes results especially hard to reproduce for inexperienced researchers. In response, the paper walks through the development and deployment of MLIPs, covering hardware and software requirements, training data generation, model training, validation of predictions, and MLIP inference, and gives best-practice guidance on documenting each stage. The authors also suggest plotting practices and analyses that validate deployed models and boost confidence in them, and they provide a step-by-step checklist for practitioners to use directly before publication to standardize the information reported. Overall, the paper argues that more careful documentation and validation are needed to avoid misleading results and reduce uncertainty. Its stated aim is to encourage reliable and reproducible use of MLIPs, promote consistent reporting standards, and thereby accelerate the positive impact of MLIPs in fields such as materials science, chemistry, and biology.
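As a rough illustration of the reporting stages listed above, the following Python sketch stores one entry per stage in a small record and attaches two common validation error metrics (MAE and RMSE) computed against reference values. The `MLIPReport` class, its field names, and the toy numbers are all hypothetical placeholders chosen for this example; they are not the authors' checklist or data.

```python
import json
import math
from dataclasses import dataclass, asdict, field


def mae(reference, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)


def rmse(reference, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(squared / len(reference))


@dataclass
class MLIPReport:
    # Hypothetical field names, one per stage named in the summary;
    # this is an assumption for illustration, not the paper's checklist.
    hardware: str
    software: str
    training_data: str
    model_training: str
    inference: str
    validation: dict = field(default_factory=dict)


# Toy values in arbitrary units, for illustration only.
ref_energies = [1.0, 2.0, 3.0]
pred_energies = [1.1, 1.9, 3.2]

report = MLIPReport(
    hardware="to be filled in",
    software="to be filled in",
    training_data="to be filled in",
    model_training="to be filled in",
    inference="to be filled in",
)
report.validation = {
    "energy_mae": mae(ref_energies, pred_energies),
    "energy_rmse": rmse(ref_energies, pred_energies),
}

# Serialize the record so it can accompany a manuscript or data archive.
report_json = json.dumps(asdict(report), indent=2)
print(report_json)
```

Keeping the record as plain serializable data is one possible design choice: it makes the reported information easy to diff, archive, and check for missing entries before submission.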