Electronic Excited States from Physically Constrained Machine Learning

Edoardo Cignoni, Divya Suman, Jigyasa Nigam, Lorenzo Cupellini, Benedetta Mennucci, Michele Ceriotti
DOI: https://doi.org/10.1021/acscentsci.3c01480
IF: 18.2
2024-02-29
ACS Central Science
Abstract: Data-driven techniques are increasingly used to replace electronic-structure calculations of matter. In this context, a relevant question is whether machine learning (ML) should be applied directly to predict the desired properties or combined explicitly with physically grounded operations. We present an example of an integrated modeling approach in which a symmetry-adapted ML model of an effective Hamiltonian is trained to reproduce electronic excitations from a quantum-mechanical calculation. The resulting model can make predictions for molecules that are much larger and more complex than those on which it is trained and allows for dramatic computational savings by indirectly targeting the outputs of well-converged calculations while using a parametrization corresponding to a minimal atom-centered basis. These results emphasize the merits of intertwining data-driven techniques with physical approximations, improving the transferability and interpretability of ML models without affecting their accuracy and computational efficiency and providing a blueprint for developing ML-augmented electronic-structure methods.
chemistry, multidisciplinary
What problem does this paper attempt to address?
The paper addresses the problem of efficiently predicting molecular electronic excited states by combining machine learning (ML) with quantum-mechanical (QM) calculations. Specifically, the authors develop an integrated modeling approach in which a symmetry-adapted ML model predicts an effective Hamiltonian and is trained to reproduce electronic excitations obtained from QM calculations. This approach allows predictions for molecules larger and more complex than those in the training data, while significantly reducing computational cost. Key points of the paper include:
- Proposing an indirect learning framework in which the ML model predicts the elements of a single-particle effective Hamiltonian, which are then used to compute molecular orbital energies and other electronic-structure properties.
- Comparing several training strategies: learning the Hamiltonian matrix elements directly, learning only the molecular orbital energies in a minimal basis set, and learning both the orbital energies and the Löwdin charges.
- Demonstrating that training on targets computed with larger basis sets achieves high accuracy even when the model is still parametrized in a minimal basis set.
- Showing that the ML effective Hamiltonian can further be used to compute electronic excited-state energies with good accuracy.
- Indicating that the method generalizes beyond the small molecules of the training set to larger systems, including long-chain polyenes and aromatic compounds.
- Illustrating how the hybrid model can be combined with advanced simulation techniques to predict subtle quantum-mechanical effects, such as the vibrational spectrum of an anthracene molecule, at low cost.
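The indirect learning idea summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the closed-shell assumption, the `ao_to_atom` mapping, and the loss weight `w_q` are all hypothetical, and the Hamiltonian `H` is a plain array standing in for the output of the symmetry-adapted ML model. The sketch only shows how a minimal-basis effective Hamiltonian could be turned into molecular orbital energies and Löwdin charges, and how a loss could compare those derived quantities (rather than the matrix elements) with reference values.

```python
import numpy as np

def lowdin_orthogonalize(H, S):
    """Transform H to the Löwdin-orthogonalized basis: S^(-1/2) H S^(-1/2)."""
    s_vals, s_vecs = np.linalg.eigh(S)
    # S^(-1/2) built from the eigendecomposition of the overlap matrix
    S_inv_sqrt = (s_vecs / np.sqrt(s_vals)) @ s_vecs.T
    return S_inv_sqrt @ H @ S_inv_sqrt

def mo_energies_and_charges(H, S, n_electrons, ao_to_atom, nuclear_charges):
    """Orbital energies and Löwdin charges from an effective Hamiltonian.

    Assumes a closed-shell system (hypothetical simplification).
    ao_to_atom[i] is the index of the atom on which basis function i sits.
    """
    H_orth = lowdin_orthogonalize(H, S)
    eps, C = np.linalg.eigh(H_orth)          # ascending orbital energies
    n_occ = n_electrons // 2
    # Density matrix in the orthogonalized basis; its diagonal gives
    # the Löwdin populations of the individual basis functions.
    P = 2.0 * C[:, :n_occ] @ C[:, :n_occ].T
    atom_pop = np.zeros(len(nuclear_charges))
    np.add.at(atom_pop, ao_to_atom, np.diag(P))
    return eps, nuclear_charges - atom_pop

def indirect_loss(H_pred, S, eps_ref, q_ref, n_electrons,
                  ao_to_atom, nuclear_charges, w_q=1.0):
    """Loss on derived quantities instead of on Hamiltonian elements.

    w_q is an assumed weight balancing energies against charges.
    """
    eps, q = mo_energies_and_charges(H_pred, S, n_electrons,
                                     ao_to_atom, nuclear_charges)
    return np.mean((eps - eps_ref) ** 2) + w_q * np.mean((q - q_ref) ** 2)
```

In a training loop, `H_pred` would come from the model and the gradient of `indirect_loss` would be propagated through the diagonalization with an automatic-differentiation framework; the NumPy version above only evaluates the loss.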
In summary, this work presents a novel approach that enhances the transferability and interpretability of machine learning models by combining data-driven techniques with physics-based approximations, while maintaining high accuracy and computational efficiency.