Charting electronic-state manifolds across molecules with multi-state learning and gap-driven dynamics via efficient and robust active learning

Pavlo O. Dral, Mikołaj Martyka, Lina Zhang, Fuchun Ge, Yi-Fan Hou, Joanna Jankowska, Mario Barbatti
DOI: https://doi.org/10.26434/chemrxiv-2024-dtc1w
2024-08-06
Abstract: We present a robust protocol for affordable learning of the electronic-state manifold to accelerate photophysical and photochemical molecular simulations. The protocol solves several pertinent issues precluding the widespread use of machine learning (ML) in excited-state simulations. We introduce a novel physics-informed multi-state ML model that can learn an arbitrary number of excited states across molecules with accuracy better than or similar to the accuracy of learning ground-state energies with established ML potentials. We also present gap-driven dynamics for meticulous accelerated sampling of the small-gap regions, which proves crucial for stable surface-hopping dynamics. Put together, multi-state learning and gap-driven dynamics enable efficient active learning, furnishing robust models for surface-hopping simulations. Our active-learning protocol includes sampling based on physics-informed uncertainty quantification, ensuring the quality of each adiabatic surface, low error in energy gaps, and precise calculation of the hopping probability. The thresholds for uncertainty quantification are automatically chosen based on statistical and physical considerations. The protocol will be made available with the next release of the open-source MLatom as described at https://github.com/dralgroup/al-namd
Chemistry
What problem does this paper attempt to address?
The paper aims to address the following key issues:
1. **Efficient Learning of Electronic-State Manifolds**: The study proposes a new machine learning (ML) protocol for affordable learning of electronic-state manifolds, thereby accelerating photophysical and photochemical molecular simulations.
2. **Multi-State Learning and Gap-Driven Dynamics**: A novel physics-informed multi-state ML model is introduced that can learn an arbitrary number of excited states across different molecules, with accuracy similar to or better than that of established ML potentials for ground-state energies. In addition, gap-driven dynamics is proposed for meticulous, accelerated sampling of small-gap regions, which is crucial for stable surface-hopping dynamics.
3. **Efficient Active Learning Protocol**: The study combines multi-state learning and gap-driven dynamics into an efficient and robust active-learning protocol for constructing models suitable for surface-hopping simulations. The protocol includes sampling based on physics-informed uncertainty quantification (UQ), ensuring the quality of each adiabatic surface, low errors in energy gaps, and accurate calculation of hopping probabilities.
4. **Overcoming Existing Challenges**: The paper tackles problems that have hindered the widespread application of ML-assisted trajectory surface hopping (TSH): predicting dense potential energy surface manifolds with complex topographies and small state-to-state gaps, and learning excited states simultaneously across different molecules and different reference electronic structure levels.
5. **Achieving Efficient and User-Friendly Solutions**: With these methods, the researchers demonstrate that final simulation results can be obtained at an economical cost, making ML-accelerated TSH affordable and achievable within days on commodity hardware.
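The idea behind gap-driven dynamics in item 2 can be illustrated with a toy model. The sketch below is not the paper's implementation: it assumes a hypothetical one-dimensional two-state system, treats the energy gap itself as the potential that drives the motion, and records geometries whose gap falls below an arbitrary threshold. The model parameters, the threshold, and all function names are assumptions made for illustration; this summary does not specify how the actual method switches between gap-driven and ordinary propagation.

```python
import math

# Toy 1D two-state model (all values are illustrative assumptions).
K = 1.0          # force constant shared by both diabatic wells
SHIFT = 2.0      # displacement of the second diabatic well
COUPLING = 0.05  # constant diabatic coupling; minimum gap is 2 * COUPLING


def adiabatic_energies(x):
    """Return (lower, upper) adiabatic energies at position x."""
    v1 = 0.5 * K * x * x
    v2 = 0.5 * K * (x - SHIFT) ** 2
    mean = 0.5 * (v1 + v2)
    half_gap = math.hypot(0.5 * (v1 - v2), COUPLING)
    return mean - half_gap, mean + half_gap


def gap(x):
    """Energy gap between the two adiabatic states at position x."""
    lower, upper = adiabatic_energies(x)
    return upper - lower


def gap_force(x, h=1e-5):
    """Force on the gap 'potential': minus its central-difference derivative."""
    return -(gap(x + h) - gap(x - h)) / (2.0 * h)


def gap_driven_trajectory(x0, v0=0.0, mass=1.0, dt=0.01,
                          n_steps=2000, gap_threshold=0.2):
    """Velocity-Verlet propagation on the gap; collect small-gap points."""
    x, v = x0, v0
    f = gap_force(x)
    sampled = []
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass   # first half kick
        x += dt * v                # drift
        f = gap_force(x)
        v += 0.5 * dt * f / mass   # second half kick
        if gap(x) < gap_threshold:
            sampled.append(x)      # candidate geometry for labeling
    return sampled


points = gap_driven_trajectory(x0=-0.5)
print(len(points), "small-gap geometries collected")
```

Because the trajectory moves downhill on the gap rather than on a single surface, it repeatedly passes through the avoided crossing near x = 1 in this toy model, which is the kind of region that ordinary dynamics may visit only rarely.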
Furthermore, the proposed methods can learn any number of electronic states, not only for a single molecule but also across different molecules and different reference electronic structure levels. In summary, the study addresses several core challenges in ML-assisted TSH by developing accurate and scalable physics-informed multi-state models, proposing a method to accelerate sampling in the critical small-gap regions, and implementing an end-to-end efficient and robust active-learning protocol.
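The uncertainty-based sampling step of such an active-learning loop can be sketched as follows. The summary says only that thresholds are chosen automatically from statistical and physical considerations, so the specific rule used here (median plus three median absolute deviations of the disagreement between two models) is an assumption, as are the function names and the example numbers.

```python
from statistics import median


def robust_threshold(deviations, n_mad=3.0):
    """Assumed rule: median + n_mad * MAD of validation-set deviations."""
    center = median(deviations)
    mad = median([abs(d - center) for d in deviations])
    return center + n_mad * mad


def select_uncertain(main_pred, aux_pred, thresholds):
    """Return indices of geometries where any state exceeds its threshold.

    main_pred and aux_pred hold, per geometry, a list of state energies
    predicted by two independently trained (hypothetical) models.
    """
    selected = []
    for i, (main, aux) in enumerate(zip(main_pred, aux_pred)):
        deviations = [abs(m - a) for m, a in zip(main, aux)]
        if any(d > t for d, t in zip(deviations, thresholds)):
            selected.append(i)  # send this geometry for reference labeling
    return selected


# Made-up validation deviations for one state (arbitrary energy units).
threshold = robust_threshold([0.01, 0.02, 0.015, 0.012, 0.018])

# Made-up two-state predictions for three geometries.
main_pred = [[0.0, 1.0], [0.1, 1.2], [0.2, 1.1]]
aux_pred = [[0.01, 1.01], [0.1, 1.3], [0.25, 1.1]]
print(select_uncertain(main_pred, aux_pred, [threshold, threshold]))
```

Checking every state separately reflects the stated goal of ensuring the quality of each adiabatic surface; a similar test on the predicted energy gaps could be added in the same way.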