Machine Learning Guided AQFEP: A Fast & Efficient Absolute Free Energy Perturbation Solution for Virtual Screening

Andrea Bortolato, Jordan E. Crivelli-Decker, Zane Beckwith, Gary Tom, Ly Le, Romelia Salomon-Ferrer, Jackson Beall, Rafael Gomez-Bombarelli, Sheenam Khuttan
DOI: https://doi.org/10.26434/chemrxiv-2023-z3t3b
2023-12-22
Abstract: Structure-based methods have become an integral part of the modern drug discovery process. The power of virtual screening (VS) lies in its ability to rapidly and cost-effectively explore enormous chemical spaces to select promising ligands for further experimental investigation. Relative Free Energy Perturbation (RFEP) and similar methods are the gold standard for binding affinity prediction in the hit-to-lead and lead optimization phases of drug discovery, but they have a high computational cost and require a structural analog with known activity. Without a reference molecule requirement, Absolute FEP (AFEP) has, in theory, better accuracy for hit identification, but in practice its slow throughput is not compatible with VS, where fast docking and unreliable scoring functions are still the standard. Here, we present an integrated workflow to virtually screen large and diverse chemical libraries efficiently, combining active learning with a physics-based scoring function based on a fast absolute free energy perturbation method. We validated the performance of the approach in the ranking of structurally related ligands, virtual screening hit rate enrichment, and active learning chemical space exploration, disclosing the largest reported collection of free energy simulations to date.
Chemistry
What problem does this paper attempt to address?
This paper addresses the limitations of virtual screening in drug discovery. Current virtual screening workflows rely on fast docking with unreliable scoring functions, so their binding affinity predictions are often inaccurate. The more rigorous alternatives have their own constraints. Relative free energy perturbation (RFEP) is considered the gold standard, but it is computationally expensive and requires a structural analog with known activity. Absolute free energy perturbation (AFEP) needs no reference molecule and is, in theory, more accurate for hit identification, but its throughput is too slow for large-scale screening.

The paper proposes AQFEP (Accelerated Quantitative Free Energy Perturbation), a fast absolute free energy perturbation method that serves as a physics-based scoring function, and combines it with machine learning in an active learning workflow for screening large and diverse chemical libraries. Bayesian optimization is used to choose which compounds receive AQFEP calculations, which reduces the computational cost of finding high-scoring compounds.

The study compares this workflow with common alternatives such as random search and selecting the top-docking compounds. It evaluates the approach on three tasks: ranking structurally related ligands, virtual screening hit rate enrichment, and active learning exploration of chemical space. The results show that AQFEP maintains good predictive performance while improving efficiency. In summary, the paper aims to improve both the accuracy and the efficiency of virtual screening in the drug discovery process by introducing the AQFEP method.
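The active learning idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name in it is hypothetical: `expensive_score` is a toy function standing in for a costly free energy calculation, each compound is reduced to a single made-up descriptor value, the surrogate is a simple k-nearest-neighbour average rather than the model used in the paper, and the acquisition rule is a basic lower confidence bound chosen only to show how a Bayesian-optimization-style loop spends a limited budget of expensive calculations.

```python
import random


def expensive_score(x):
    """Hypothetical stand-in for a costly free energy calculation (lower is better)."""
    return (x - 0.7) ** 2 * 10.0 - 12.0


def predict(x, labeled, k=3):
    """Toy surrogate: mean score of the k nearest labeled compounds,
    with the distance to the nearest one used as a crude uncertainty."""
    nearest = sorted(labeled, key=lambda item: abs(item[0] - x))[:k]
    mean = sum(y for _, y in nearest) / len(nearest)
    uncertainty = abs(nearest[0][0] - x)
    return mean, uncertainty


def active_learning(library, n_init=5, batch=5, rounds=5, beta=5.0, seed=0):
    """Score a random initial set, then repeatedly score the batch that the
    surrogate ranks best under a lower confidence bound."""
    rng = random.Random(seed)
    pool = list(library)
    rng.shuffle(pool)
    labeled = [(x, expensive_score(x)) for x in pool[:n_init]]
    pool = pool[n_init:]

    for _ in range(rounds):
        def acquisition(x):
            # Favour compounds with a low predicted score or a high uncertainty.
            mean, uncertainty = predict(x, labeled)
            return mean - beta * uncertainty

        pool.sort(key=acquisition)
        chosen, pool = pool[:batch], pool[batch:]
        labeled.extend((x, expensive_score(x)) for x in chosen)

    return labeled


# Toy library of 1000 compounds, each described by one made-up descriptor.
library = [i / 999 for i in range(1000)]
results = active_learning(library)
best = min(results, key=lambda item: item[1])
print(f"evaluated {len(results)} of {len(library)} compounds; best score {best[1]:.2f}")
```

In this sketch only 30 of the 1000 compounds are ever passed to the expensive function. A random-search baseline of the kind mentioned above would draw the same 30 compounds uniformly from the library, and a top-docking baseline would take the 30 best compounds under a cheap docking score.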