Benchmarking HelixFold3-Predicted Holo Structures for Relative Free Energy Perturbation Calculations

Kairi Furui, Masahito Ohue
DOI: https://doi.org/10.1101/2024.10.27.620454
2024-10-29
Abstract: AlphaFold2 demonstrated remarkable capabilities for protein structure prediction. However, it is limited for downstream tasks, such as ligand docking and free energy calculations, as it cannot predict holo structures with bound ligands. AlphaFold3, a state-of-the-art protein structure prediction model, can predict the binding structures of complexes with proteins, nucleic acids, small molecules, ions, and modified residues with cutting-edge performance. However, AlphaFold3 does not currently provide access to some functions, such as the prediction of protein-ligand complex structures. To reduce the enormous costs in early small molecule drug discovery, verifying the utility of protein-ligand complex prediction methods, such as AlphaFold3, in downstream tasks like free energy perturbation calculations is crucial. In this study, we evaluated HelixFold3, designed to emulate AlphaFold3, in predicting holo and apo structures' complex formations and examined its utility in free energy perturbation calculations. Regarding the complex structure prediction performance of the 8 targets from Wang et al.'s FEP benchmark, HelixFold3 showed superior performance to AlphaFold2 and existing methods. Predicting a holo structure rather than an apo structure resulted in higher binding site prediction accuracy. Furthermore, using HelixFold3-predicted structures in practical situations, where binding free energies of all derivatives were estimated, both structures achieved accuracies comparable to crystal structures. Additionally, novel derivatives not included in the training data were accurately predicted, demonstrating that free energy calculations using these novel structures are sufficiently usable.
Bioinformatics
What problem does this paper attempt to address?
The main problem this paper addresses is whether protein-ligand complex structures (holo structures) predicted by HelixFold3 can be used in free energy perturbation (FEP) calculations, which would help reduce the high cost of early small-molecule drug discovery. Specifically, the paper focuses on the following aspects:

1. **Functional limitations of AlphaFold3**:
   - Although AlphaFold3 can predict the binding structures of proteins, nucleic acids, small molecules, ions and modified residues, some of its functions (such as protein-ligand complex structure prediction) are not yet openly accessible.
   - HelixFold3 is a model designed to emulate AlphaFold3. It can be used for protein-ligand complex structure prediction, and its code and model are publicly available.
2. **Verify the practicality of predicted structures**:
   - The paper evaluates how HelixFold3-predicted holo and apo structures perform in FEP calculations, and in particular whether, in practical application scenarios, these predicted structures can replace crystal structures for accurate binding free energy calculations.
   - The structure prediction performance of HelixFold3 is compared with AlphaFold2 and other existing methods (such as ColabFold, RoseTTAFold All-Atom, Umol).
3. **Evaluate the prediction ability for unknown ligands**:
   - The paper also evaluates HelixFold3 on new ligands not included in the training data, in order to test its generality and robustness in actual drug discovery.

### Specific research contents

- **Dataset selection**: Use Wang et al.'s FEP benchmark dataset, including 8 target proteins and their ligands.
- **Structure prediction**: Predict the holo and apo structures of each target protein with HelixFold3, and compare them with the crystal structures.
- **FEP calculation**: Use the Cresset Flare FEP tool to perform FEP calculations on the predicted structures and crystal structures, and evaluate the prediction accuracy of binding free energy.
- **Performance evaluation**: Compare the prediction performance of different methods through metrics such as root-mean-square deviation (RMSD), mean unsigned error (MUE), Kendall’s τ and the squared Pearson correlation coefficient (\( R^2 \)).

### Main conclusions

- HelixFold3 shows higher binding site prediction accuracy when predicting holo structures than apo structures.
- The results of FEP calculations using HelixFold3-predicted holo structures are comparable to those using crystal structures, and even perform better in some cases.
- For new ligands not included in the training data, HelixFold3 can also provide relatively accurate predictions, showing its potential in actual drug discovery.

Through these studies, the paper verifies the effectiveness and practicality of HelixFold3 in protein-ligand complex structure prediction and subsequent FEP calculations, providing new tools and methods for early drug discovery.
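
To make the evaluation metrics named above concrete, here is a minimal Python sketch using their textbook definitions. It is not the authors' code: the function names, the choice of Kendall's τ-a (no tie correction), the RMSD without structural superposition, and all numeric values are assumptions made for illustration only.

```python
import math
from itertools import combinations


def mue(pred, ref):
    """Mean unsigned error between predicted and reference free energies."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(pred)


def pearson_r2(pred, ref):
    """Squared Pearson correlation coefficient between two value lists."""
    n = len(pred)
    mean_p, mean_r = sum(pred) / n, sum(ref) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    return cov * cov / (var_p * var_r)


def kendall_tau_a(pred, ref):
    """Kendall's tau-a: (concordant - discordant) / total pairs, no tie correction."""
    concordant = discordant = 0
    for (p1, r1), (p2, r2) in combinations(zip(pred, ref), 2):
        sign = (p1 - p2) * (r1 - r2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(pred) * (len(pred) - 1) // 2
    return (concordant - discordant) / n_pairs


def rmsd(coords_a, coords_b):
    """RMSD over matched atoms; assumes both sets share one frame (no superposition)."""
    sq = sum((a - b) ** 2
             for atom_a, atom_b in zip(coords_a, coords_b)
             for a, b in zip(atom_a, atom_b))
    return math.sqrt(sq / len(coords_a))


# Hypothetical binding free energies in kcal/mol (not taken from the paper).
experimental = [-9.0, -8.0, -10.0, -7.0]
predicted = [-8.0, -8.5, -9.0, -7.5]

# Hypothetical ligand atom coordinates in angstroms (not taken from the paper).
crystal_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]

print("MUE:", mue(predicted, experimental))
print("R^2:", pearson_r2(predicted, experimental))
print("Kendall tau-a:", kendall_tau_a(predicted, experimental))
print("RMSD:", rmsd(predicted_pose, crystal_pose))
```

MUE and RMSD measure absolute error, while Kendall's τ and \( R^2 \) measure how well the ranking and linear trend of the ligands are reproduced. A method can do well on one kind of metric and poorly on the other, which is a likely reason benchmarks of this type report both.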