Multi-Fidelity Machine Learning for Excited State Energies of Molecules

Vivin Vinod, Sayan Maity, Peter Zaspel, Ulrich Kleinekathöfer
DOI: https://doi.org/10.1021/acs.jctc.3c00882
2023-05-19
Abstract: The accurate but fast calculation of molecular excited states is still a very challenging topic. For many applications, detailed knowledge of the energy funnel in larger molecular aggregates is of key importance requiring highly accurate excited state energies. To this end, machine learning techniques can be an extremely useful tool though the cost of generating highly accurate training datasets still remains a severe challenge. To overcome this hurdle, this work proposes the use of multi-fidelity machine learning where very little training data from high accuracies is combined with cheaper and less accurate data to achieve the accuracy of the costlier level. In the present study, the approach is employed to predict the first excited state energies for three molecules of increasing size, namely, benzene, naphthalene, and anthracene. The energies are trained and tested for conformations stemming from classical molecular dynamics simulations and from real-time density functional tight-binding calculations. It can be shown that the multi-fidelity machine learning model can achieve the same accuracy as a machine learning model built only on high cost training data while having a much lower computational effort to generate the data. The numerical gain observed in these benchmark test calculations was over a factor of 30 but certainly can be much higher for high accuracy data.
Chemical Physics, Machine Learning, Computational Physics
What problem does this paper attempt to address?
The paper addresses the difficulty of computing molecular excited state energies both quickly and accurately, particularly for larger molecular aggregates where highly accurate energies are needed. Its goals can be summarized as follows:

1. **Improve computational efficiency**: High-accuracy excited state calculations are time-consuming and computationally expensive. The paper proposes a Multi-Fidelity Machine Learning (MFML) method that reduces the cost of generating training datasets while maintaining prediction accuracy.
2. **Reduce the need for expensive data**: Training an accurate machine learning model usually requires many high-accuracy data points, which are costly to obtain. MFML combines a small amount of high-accuracy data with a larger amount of cheaper, less accurate data.
3. **Validate the effectiveness of the MFML method**: The paper tests MFML on the first excited state energies of three molecules of increasing size (benzene, naphthalene, and anthracene). Comparing single-fidelity models with MFML models built from different fidelity levels shows that MFML reaches similar prediction accuracy with far less high-accuracy data.

In summary, the paper develops and evaluates an MFML method that keeps the accuracy of the costlier level while substantially lowering the cost of data generation. In the benchmark calculations on these three molecules, the reported gain was over a factor of 30.
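The idea of combining a little expensive data with plenty of cheap data can be illustrated with a toy example. The sketch below is not the paper's implementation: it uses a simplified two-fidelity scheme (a model of the cheap data plus a learned correction) with a small hand-written kernel ridge regression, and the functions `f_low` and `f_high`, the sample counts, and the kernel widths are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between two 1D point sets."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma**2))

def krr_fit(x, y, sigma, lam=1e-6):
    """Solve (K + lam * I) alpha = y for the regression weights."""
    K = gaussian_kernel(x, x, sigma)
    return np.linalg.solve(K + lam * np.eye(len(x)), y)

def krr_predict(x_train, alpha, x_new, sigma):
    """Evaluate the fitted kernel model at new points."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

# Hypothetical stand-ins for a cheap and an expensive level of theory.
def f_low(x):
    return np.sin(3.0 * x)

def f_high(x):
    return f_low(x) + 0.3 * np.cos(0.5 * x)

x_low = np.linspace(0.0, 2.0 * np.pi, 64)    # many cheap samples
x_high = np.linspace(0.0, 2.0 * np.pi, 8)    # few expensive samples
x_test = np.linspace(0.0, 2.0 * np.pi, 200)

# Two-fidelity model: fit the cheap data, then fit the smooth
# high-minus-low difference on the few expensive points.
alpha_low = krr_fit(x_low, f_low(x_low), sigma=0.5)
alpha_delta = krr_fit(x_high, f_high(x_high) - f_low(x_high), sigma=1.5)
pred_mf = (krr_predict(x_low, alpha_low, x_test, 0.5)
           + krr_predict(x_high, alpha_delta, x_test, 1.5))

# Single-fidelity baseline: only the few expensive points.
alpha_sf = krr_fit(x_high, f_high(x_high), sigma=0.5)
pred_sf = krr_predict(x_high, alpha_sf, x_test, 0.5)

mae_mf = np.mean(np.abs(pred_mf - f_high(x_test)))
mae_sf = np.mean(np.abs(pred_sf - f_high(x_test)))
print(f"two-fidelity MAE: {mae_mf:.4f}, single-fidelity MAE: {mae_sf:.4f}")
```

In this toy setting the correction is much smoother than the target itself, so eight expensive samples are enough to learn it, while the same eight samples alone cannot resolve the full function. The MFML scheme described in the paper extends this idea to more than two fidelity levels.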