Abstract: Protein structure modeling and prediction have important applications throughout the biological sciences, from the design of pharmaceuticals to the elucidation of enzyme mechanisms. At the core of most protein modeling is an energy function, the minimum of which represents the free energy "cost" for forming a correct protein structure. The most commonly used energy functions are knowledge-based statistical potential functions; that is, they are empirically derived from statistical analysis of a set of high-resolution protein structures. When such a potential function is constructed, the anisotropic orientation dependence between the interacting groups is a critical component for accurately representing key molecular interactions, such as those involved in protein side-chain packing. Many potential functions in the literature, however, are limited in their ability to describe orientation dependence. All-atom potentials typically ignore heterogeneous chemical-bond connectivity. In coarse-grained potentials, such as (semi)-residue-based potentials, the simplified representation of residues often reduces the sensitivity of the potential to side-chain orientation. Recently, in an effort to maximally capture the orientation dependence in side-chain interactions, a new type of all-atom statistical potential was developed: OPUS-PSP (potential derived from side-chain packing). The key feature of this potential is its explicit description of orientation dependence in molecular interactions, which is achieved with a basis set of 19 rigid-body blocks extracted from the chemical structures of the 20 amino acid residues. This basis set is specifically designed to maximally capture the essential elements of orientation dependence in molecular packing interactions. The potential is constructed from the orientation-specific packing statistics of pairs of those blocks in a nonredundant structural database.
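To make the idea of a potential "derived from packing statistics" concrete, the following is a minimal sketch of a generic inverse-Boltzmann conversion from counts to energies. It is not the OPUS-PSP functional form, which is not given here; the orientation bin labels, the counts, the pseudocount, and the function name `statistical_potential` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def statistical_potential(observed, reference, pseudocount=1.0, kT=1.0):
    """Generic inverse-Boltzmann energies: E(bin) = -kT * ln(P_obs / P_ref).

    `observed` and `reference` map a (hypothetical) orientation bin to a count.
    A pseudocount keeps empty bins from producing log(0).
    """
    bins = set(observed) | set(reference)
    n_obs = sum(observed.values()) + pseudocount * len(bins)
    n_ref = sum(reference.values()) + pseudocount * len(bins)
    energies = {}
    for b in bins:
        p_obs = (observed.get(b, 0) + pseudocount) / n_obs
        p_ref = (reference.get(b, 0) + pseudocount) / n_ref
        energies[b] = -kT * math.log(p_obs / p_ref)
    return energies


# Made-up counts for one pair of rigid-body blocks, binned by relative orientation.
observed = Counter({"parallel": 60, "perpendicular": 30, "antiparallel": 10})
# Made-up reference-state counts (roughly uniform over orientations).
reference = Counter({"parallel": 34, "perpendicular": 33, "antiparallel": 33})

energies = statistical_potential(observed, reference)
for orientation, energy in sorted(energies.items(), key=lambda kv: kv[1]):
    print(f"{orientation:>14s}  {energy:+.3f} kT")
```

Orientations seen more often than the reference state predicts receive negative (favorable) energies, and rarer ones receive positive energies. The choice of reference state is a modeling decision that this sketch deliberately leaves simple.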
On decoy set tests, OPUS-PSP significantly outperforms most of the existing knowledge-based potentials in terms of both its ability to recognize native structures and its consistency in achieving high Z scores across decoy sets. The application of OPUS-PSP to conformational modeling of side chains has led to another method, called OPUS-Rota. In terms of combined speed and accuracy, OPUS-Rota outperforms all of the other methods in modeling side-chain conformation. In this Account, we briefly outline the basic scheme of the OPUS-PSP potential and its application to side-chain modeling via OPUS-Rota. Future perspectives on the modeling of orientation dependence are also discussed. The computer programs for OPUS-PSP and OPUS-Rota can be downloaded at http://sigler.bioch.bcm.tmc.edu/MaLab and are free for academic users.
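For readers unfamiliar with decoy-set evaluation, here is a minimal sketch of how a Z score for the native structure is commonly computed. It assumes the usual convention Z = (E_native - mean(E_decoys)) / std(E_decoys), under which a more negative value means better discrimination; papers that speak of "high" Z scores often report the magnitude. The energies below are made up, and `native_z_score` is a hypothetical helper, not part of the OPUS-PSP programs.

```python
import statistics


def native_z_score(native_energy, decoy_energies):
    """Z score of the native energy relative to the decoy energy distribution."""
    mean = statistics.mean(decoy_energies)
    std = statistics.pstdev(decoy_energies)  # population standard deviation
    if std == 0:
        raise ValueError("decoy energies have zero spread; Z score is undefined")
    return (native_energy - mean) / std


# Made-up energies in arbitrary units for one decoy set.
decoys = [-80.0, -90.0, -100.0, -110.0, -120.0]
native = -150.0

z = native_z_score(native, decoys)
# The native structure is "recognized" if it has the lowest energy in the set.
native_is_lowest = native < min(decoys)
print(f"Z = {z:.2f}, native ranked first: {native_is_lowest}")
```

Reporting both the native rank and the Z score is useful because a potential can rank the native structure first while separating it only marginally from the decoys.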

Explicit Orientation Dependence in Empirical Potentials and Its Significance to Side-Chain Modeling
