Abstract:
Background: Residue-residue interactions play important roles in the functional and spatial relationships of proteins. These interactions are usually related to the sequence but display close proximity within the three-dimensional structure. In the past few years, identifying residue-residue contacts in proteins has become an important prediction problem.
Objective: Many methods extract contact information from multiple sequence alignments (MSAs) of homologous protein sequences. However, they need a large number of homologous sequences, several thousand on average, for residue-residue contact prediction.
Method: In this article, we use both phylogenetic information and amino acid frequency to predict residue-residue contacts from small MSAs. To better reflect evolutionary information, we combine the evolutionary distance matrix and the similarity matrix, and produce a novel score, based on amino acid frequency, to filter out noise. We use this information to estimate the correlation coefficient between each pair of sites in a target protein family, and extract binding sites with high values of the final correlative score.
Results: First, we present a statistical analysis of the correlative relationship in residue-residue contacts. Second, we evaluate our method on 150 benchmark proteins for residue-residue contact prediction. Third, we identify protein-protein interactions in bacterial signal transduction. Experiments show that our method is very effective in real applications.
Conclusion: When few protein sequences are available, experimental results confirm that our method performs better than some currently popular methods. Because our method requires fewer homologous proteins, the computing time to construct phylogenetic trees decreases significantly.
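To make the idea of a per-pair site correlation in the Method more concrete, the following is a minimal sketch of a classic correlated-mutation score, not the authors' actual scoring function. It assumes a simple identity-based similarity between residues and omits the evolutionary distance matrix and the frequency-based noise filter described in the abstract. The function names `column_similarity_matrix` and `site_correlation` are hypothetical.

```python
import numpy as np

def column_similarity_matrix(column, identity_score=1.0, mismatch_score=0.0):
    """Return an N x N matrix of residue similarity for every sequence pair at one site.

    Assumption: similarity is plain identity (1 if equal, 0 otherwise). A real
    method would likely use a substitution matrix instead.
    """
    col = np.asarray(list(column))
    return np.where(col[:, None] == col[None, :], identity_score, mismatch_score)

def site_correlation(msa, i, j):
    """Pearson correlation between the pairwise-similarity patterns of sites i and j."""
    n = len(msa)
    sim_i = column_similarity_matrix([seq[i] for seq in msa])
    sim_j = column_similarity_matrix([seq[j] for seq in msa])
    upper = np.triu_indices(n, k=1)  # count each unordered sequence pair once
    x, y = sim_i[upper], sim_j[upper]
    if x.std() == 0 or y.std() == 0:
        # A fully conserved site carries no covariation signal; return 0 by convention.
        return 0.0
    return float(np.corrcoef(x, y)[0, 1])

# Toy alignment (hypothetical data): sites 0 and 1 change together, site 2 is conserved.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
covarying = site_correlation(toy_msa, 0, 1)
conserved = site_correlation(toy_msa, 0, 2)
```

In this toy alignment the first two sites mutate together, so their correlation is high, while the conserved third site yields no signal. Ranking all site pairs by such a score is the general shape of a coevolution-based contact predictor.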
On 150 benchmark proteins, our method achieves overall precisions of 68%, 64%, 54% and 45% for the top L/10, L/5, L/2 and L ranked predictions, respectively, where L is the length of the protein sequence. Our method outperforms normalized Mutual Information scoring with sequence weighting and the Bayesian approach of Burger & van Nimwegen (B&vN).
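The top L/k precision reported above can be computed along the following lines. This is a minimal sketch, assuming a symmetric L x L score matrix and a boolean matrix of true contacts. The function name `top_fraction_precision` and the `min_separation` parameter are hypothetical, and the source does not state which sequence-separation cutoff, if any, was used.

```python
import numpy as np

def top_fraction_precision(scores, contacts, divisor, min_separation=1):
    """Precision among the top L/divisor ranked residue pairs.

    scores:   L x L array of predicted coupling scores (higher = more likely contact)
    contacts: L x L boolean array of true contacts
    divisor:  10, 5, 2 or 1 for the top L/10, L/5, L/2 or L predictions
    min_separation: minimum |i - j| for a pair to be considered (assumed, not from the source)
    """
    length = scores.shape[0]
    rows, cols = np.triu_indices(length, k=min_separation)  # pairs with j - i >= min_separation
    ranked = np.argsort(scores[rows, cols])[::-1]           # best-scoring pairs first
    k = max(1, length // divisor)
    top = ranked[:k]
    return float(contacts[rows[top], cols[top]].mean())

# Toy example (hypothetical data) for a protein of length 4.
scores = np.zeros((4, 4))
scores[0, 3], scores[0, 2], scores[1, 3] = 0.9, 0.8, 0.7
contacts = np.zeros((4, 4), dtype=bool)
contacts[0, 3] = contacts[1, 3] = True

precision_top_l4 = top_fraction_precision(scores, contacts, divisor=4)
precision_top_l2 = top_fraction_precision(scores, contacts, divisor=2)
```

Here the single best-scoring pair is a true contact, while only one of the top two pairs is, which illustrates why precision typically falls as more predictions are included.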

How pairwise coevolutionary models capture the collective residue variability in proteins

From principal component to direct coupling analysis of coevolution in proteins: Low-eigenvalue modes are needed for structure prediction

Direct-coupling analysis of residue co-evolution captures native contacts across many protein families

Inter-residue, inter-protein and inter-family coevolution: bridging the scales

Simultaneous identification of specifically interacting paralogs and inter-protein contacts by Direct-Coupling Analysis

A new formulation of protein evolutionary models that account for structural constraints

On the Accuracy of Inferring Energetic Coupling Between Distant Sites in Protein Families from Evolutionary Imprints: Illustrations Using Lattice Model.

Identifying Coevolution Between Amino Acid Residues in Protein Families: Advances in the Improvement and Evaluation of Correlated Mutation Algorithms

CopulaNet: Learning Residue Co-Evolution Directly from Multiple Sequence Alignment for Protein Structure Prediction

Selection of sequence motifs and generative Hopfield-Potts models for protein families

Protein structure prediction from sequence variation

Inter-protein sequence co-evolution predicts known physical interactions in bacterial ribosomes and the trp operon

Identification Of Residue-Residue Contacts Using A Novel Coevolution-Based Method

What Does Evolution Tell Us About The Structure Of A Functional Amyloid Protein?

Deep Virtual Compton Scattering and the Nucleon Generalized Parton Distributions

Sequence co-evolution gives 3D contacts and structures of protein complexes

Capturing coevolutionary signals in repeat proteins

Direct Coupling Analysis of Epistasis in Allosteric Materials

Membrane protein contact and structure prediction using co-evolution in conjunction with machine learning

Aligning biological sequences by exploiting residue conservation and coevolution

Co-evolution Transformer for Protein Contact Prediction.