Assessing the accuracy and efficiency of free energy differences obtained from reweighted flow-based probabilistic generative models

Matteo Salvalaglio, Michael Shirts, Ahmad Y Sheikh, Yifei Michelle Liu, Nada Mehio, Edgar Olehnovics
DOI: https://doi.org/10.26434/chemrxiv-2024-z9g39
2024-04-22
Abstract: Computing free energy differences between metastable states characterized by non-overlapping Boltzmann distributions is often a computationally intensive endeavour, usually requiring chains of intermediate states to connect these metastable states. Targeted free energy perturbation (TFEP) can significantly lower the computational cost of FEP calculations by choosing a set of invertible maps used to directly transform the distributions of interest, achieving the necessary statistically significant overlaps without sampling any intermediate states. Probabilistic generative models (PGMs) based on normalising-flow architectures can make it much easier via machine learning to train invertible maps needed for TFEP. However, the accuracy and applicability of approaches based on empirically learned maps depend crucially on the choice of reweighting method adopted to estimate the free energy differences. In this work, we assess the accuracy, rate of convergence, and data efficiency of different free energy estimators, including exponential averaging, BAR, and MBAR, in reweighting PGMs trained by maximum likelihood on limited amounts of molecular dynamics data sampled only from end-states of interest. We carry out the comparisons on a set of simple but representative case studies, including conformational ensembles of alanine dipeptide and ibuprofen. Our results indicate that BAR and MBAR are both data efficient and robust, even in the presence of significant model overfitting in the generation of invertible maps. This analysis can serve as a stepping stone for the deployment of efficient and quantitatively accurate ML-based FE calculation methods in complex systems.
Chemistry
What problem does this paper attempt to address?
This paper evaluates the accuracy and efficiency of free energy differences computed by reweighting flow-based probabilistic generative models (PGMs). In molecular modeling, for example in drug design, estimating the relative thermodynamic stability of metastable states is computationally intensive because the Boltzmann distributions of those states often do not overlap, so conventional calculations need chains of intermediate states to connect them. Targeted free energy perturbation (TFEP) reduces this cost by transforming the distributions of interest directly through an invertible map, which creates the required overlap without sampling any intermediate states. The paper investigates the accuracy, rate of convergence, and data efficiency of different free energy estimators, namely exponential averaging (EXP), the Bennett acceptance ratio (BAR), and the multistate Bennett acceptance ratio (MBAR), when they are used to reweight PGMs trained by maximum likelihood on limited molecular dynamics data sampled only from the end states. Through a set of simple but representative case studies, including the conformational ensembles of alanine dipeptide and ibuprofen, the study shows that BAR and MBAR are both data efficient and robust, even when the learned maps are significantly overfitted. The authors emphasize the importance of strategies to avoid overfitting and propose heuristics for detecting overfitting and for identifying statistically consistent free energy estimates when no reference values are available. They compare the quantitative accuracy and convergence of the standard estimators on model systems of different dimensionality and benchmark them against results from biased molecular dynamics simulations, such as well-tempered metadynamics. In summary, the paper aims to fill gaps in the existing literature and to provide guidance for developing efficient and quantitatively accurate machine-learning-based free energy calculation methods for complex systems.
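
To make the difference between the one-sided and two-sided estimators concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical one-dimensional toy problem: state A is a standard Gaussian, state B is a Gaussian with mean `mu` and width `sigma`, and a deliberately imperfect affine map `M(x) = a*x + b` stands in for a trained normalising flow. All names (`exp_estimate`, `bar_estimate`, `toy_example`, `a`, `b`, `mu`, `sigma`) are illustrative. In reduced units, the forward generalized work is `w_F(x) = u_B(M(x)) - u_A(x) - ln|dM/dx|`, and the reverse work is defined analogously with the inverse map. EXP uses only the forward work, while BAR combines both directions. For this toy problem the exact answer is `-ln(sigma)`.

```python
import numpy as np


def exp_estimate(w_f):
    """One-sided exponential averaging: -ln <exp(-w_F)> over samples from A."""
    w_f = np.asarray(w_f, dtype=float)
    shift = np.min(w_f)  # shift so that every exponent is <= 0 (no overflow)
    return shift - np.log(np.mean(np.exp(-(w_f - shift))))


def _fermi(x):
    """Fermi function 1 / (1 + exp(x)), evaluated without overflow."""
    return np.exp(-np.logaddexp(0.0, x))


def bar_estimate(w_f, w_r, n_iter=200):
    """Two-sided BAR estimate from forward (A->B) and reverse (B->A) work."""
    w_f = np.asarray(w_f, dtype=float)
    w_r = np.asarray(w_r, dtype=float)
    m = np.log(len(w_f) / len(w_r))

    def g(df):
        # BAR self-consistency condition; increases monotonically with df
        return _fermi(m + w_f - df).sum() - _fermi(-m + w_r + df).sum()

    # Expand a bracket until it contains the root, then bisect
    lo, hi = -1.0, 1.0
    while g(lo) > 0:
        lo *= 2.0
    while g(hi) < 0:
        hi *= 2.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def toy_example(n=5000, seed=0):
    """Hypothetical toy system: A = N(0, 1), B = N(mu, sigma^2)."""
    rng = np.random.default_rng(seed)
    mu, sigma = 3.0, 0.5
    a, b = 0.6, 2.8  # imperfect map; the perfect one would be a=sigma, b=mu

    def u_a(x):
        return 0.5 * x**2

    def u_b(y):
        return 0.5 * ((y - mu) / sigma) ** 2

    x = rng.normal(0.0, 1.0, n)   # samples from end state A
    y = rng.normal(mu, sigma, n)  # samples from end state B

    # Generalized work including the log-Jacobian of the affine map
    w_f = u_b(a * x + b) - u_a(x) - np.log(a)
    w_r = u_a((y - b) / a) - u_b(y) + np.log(a)

    exact = -np.log(sigma)  # f_B - f_A = -ln(Z_B / Z_A)
    return exact, exp_estimate(w_f), bar_estimate(w_f, w_r)


if __name__ == "__main__":
    exact, d_exp, d_bar = toy_example()
    print(f"exact={exact:.4f}  EXP={d_exp:.4f}  BAR={d_bar:.4f}")
```

With a perfect map the forward work is constant and equal to the free energy difference, so both estimators agree exactly. As the map degrades, the one-sided estimate depends increasingly on rare low-work samples, whereas BAR draws on samples from both end states. This sketch only illustrates that mechanism under the stated assumptions and does not reproduce the paper's systems or results.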