Abstract: Although covalent interactions determine the primary structure of a molecule, noncovalent interactions are responsible for its tertiary and quaternary structure and create the fascinating world of the 3D architectures of biomacromolecules. For example, the double-helical structure of DNA is of fundamental importance for its function: it allows DNA to store and transfer genetic information. To fulfill this role, the structure must be rigid enough to maintain the double helix with proper positioning of the complementary bases, yet floppy enough to allow for its opening. Very strong covalent interactions cannot fulfill both of these criteria, but noncovalent interactions, which are about 2 orders of magnitude weaker, can. This Account highlights recent advances in the design of novel wave function theory (WFT) methods applicable to noncovalent complexes ranging in size from fewer than 100 atoms, for which highly accurate ab initio methods are available, up to extended ones (several thousand atoms), which are the domain of semiempirical QM (SQM) methods. Accurate interaction energies for noncovalent complexes are generated by the coupled-cluster technique, with single and double electron excitations treated iteratively and triple excitations perturbatively, combined with a complete basis set description (CCSD(T)/CBS). The procedure provides interaction energies with high accuracy (error less than 1 kcal/mol). Because the method is computationally demanding, its application is limited to complexes of fewer than 30 atoms. But researchers would also like to use computational methods to determine these interaction energies accurately for larger biological and nanoscale structures. Standard QM methods such as MP2, MP3, CCSD, or DFT fail to describe various types of noncovalent systems (H-bonded, stacked, dispersion-controlled, etc.) with comparable accuracy.
Therefore, novel methods parametrized toward noncovalent interactions are needed, and benchmark data sets are an important tool for developing new methods that provide reliable characteristics of noncovalent clusters. Our laboratory developed the first suitable data set of CCSD(T)/CBS interaction energies and geometries of various noncovalent complexes, called S22. Since its publication in 2006, it has frequently been applied in the parametrization and/or verification of various wave function and density functional techniques. With intensive use of this data set, several inconsistencies emerged, such as the insufficient accuracy of the CCSD(T) correction term and its unbalanced character, which triggered the introduction of a new, broader, and more accurate data set called S66. It contains not only 66 CCSD(T)/CBS interaction energies determined at the equilibrium geometries but also 1056 interaction energies calculated at the same level for nonequilibrium geometries. The S22 and S66 data sets have been used for the verification of various WFT methods, and the lowest RMSEs (S66, in kcal/mol) were found for the recently introduced SCS-MI-CCSD/CBS (0.08), MP2.5/CBS (0.16), MP2.X/6-31G* (0.27), and SCS-MI-MP2/CBS (0.38) methods. Because of their computational economy, the MP2.5 and MP2.X/6-31G* methods can be recommended for highly accurate calculations of large complexes with up to 100 atoms. The evaluation of SQM methods was based only on the S22 data set, and because some of these methods have been parametrized toward the same data set, the respective results should be taken with caution. For truly extended complexes such as protein-ligand systems, only the SQM methods are applicable. After corrections for the dispersion energy and H-bonding are added, several methods exhibit surprisingly low RMSEs (even below 0.5 kcal/mol).
Among the various SQM methods, PM6-DH2 can be recommended because of its computational efficiency and because it can be used for optimization (which is not the case for other SQM methods). PM6-DH2 is the basis of our novel scoring function used in in silico drug design.
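
The RMSE values quoted above compare a method's interaction energies with the benchmark values over a data set. The following is a minimal sketch of that comparison, assuming the standard supermolecular definition of the interaction energy, E(AB) - E(A) - E(B), without any counterpoise correction. The function names and all numerical values are hypothetical placeholders for illustration; they are not entries from S22 or S66.

```python
import math


def interaction_energy(e_complex, e_monomer_a, e_monomer_b):
    """Supermolecular interaction energy E(AB) - E(A) - E(B).

    All three energies must be in the same unit (assumed here: kcal/mol).
    """
    return e_complex - e_monomer_a - e_monomer_b


def rmse(predicted, reference):
    """Root-mean-square error of predicted values against reference values."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical interaction energies in kcal/mol (placeholders, not benchmark data).
reference = [-5.0, -2.7, -1.5]   # stand-in for benchmark values
candidate = [-4.8, -2.9, -1.4]   # stand-in for the method under test

print(f"RMSE = {rmse(candidate, reference):.3f} kcal/mol")
```

In practice the reference list would hold the benchmark interaction energies of a data set and the candidate list the energies from the method under evaluation, computed at the same geometries.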
