Improved protein complex prediction with AlphaFold-multimer by denoising the MSA profile

Patrick Bryant, Frank Noé
DOI: https://doi.org/10.1371/journal.pcbi.1012253
2024-07-26
PLoS Computational Biology
Abstract: Structure prediction of protein complexes has improved significantly with AlphaFold2 and AlphaFold-multimer (AFM), but only 60% of dimers are accurately predicted. Here, we learn a bias to the MSA representation that improves the predictions by performing gradient descent through the AFM network. We demonstrate the performance on seven difficult targets from CASP15 and increase the average MMscore to 0.76, compared to 0.63 with AFM. We evaluate the procedure on 487 protein complexes where AFM fails and obtain an increased success rate (MMscore > 0.75) of 33% on these difficult targets. Our protocol, AFProfile, provides a way to direct predictions towards a defined target function guided by the MSA. We expect gradient descent over the MSA to be useful for different tasks.

AI networks can now predict the structure of protein complexes with high accuracy in the majority of cases. The accuracy of the predicted protein complexes is directly related to the quality of the input information. However, this information can be very noisy, which makes the output quality vary. An interesting finding is that AI networks used for structure prediction tend to know when they make wrong predictions, as reflected in their confidence in the predictions themselves. Together, this suggests that one can use the predicted confidence from the AI network to look for more useful input information. To improve the structure prediction of protein complexes, we here learn how to filter the input information, based on the predicted confidence, so that AlphaFold-multimer can use it better. We show that it is possible to do this efficiently and improve the structures in 33% of cases where AlphaFold-multimer struggles. The same filtering procedure can be used for other tasks as well, e.g. to search for alternative conformations, although this remains to be studied.
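The idea of learning a bias to the MSA representation by gradient descent can be illustrated with a toy sketch. This is not the AFProfile implementation: the real protocol differentiates through the AFM network and uses its predicted confidence, whereas here the "network" is replaced by a hypothetical quadratic confidence function with an analytic gradient. All names (`toy_confidence`, `learn_bias`, `preferred`) and the hyperparameters are assumptions made for illustration only.

```python
import random


def toy_confidence(profile, preferred):
    """Hypothetical stand-in for a network's predicted confidence.

    Returns a higher (less negative) value when the profile is closer to
    what the toy 'network' prefers. The real method would use the
    confidence predicted by AFM instead.
    """
    return -sum(
        (p - w) ** 2
        for row_p, row_w in zip(profile, preferred)
        for p, w in zip(row_p, row_w)
    )


def toy_confidence_grad(profile, preferred):
    """Analytic gradient of toy_confidence with respect to the profile."""
    return [
        [-2.0 * (p - w) for p, w in zip(row_p, row_w)]
        for row_p, row_w in zip(profile, preferred)
    ]


def learn_bias(noisy_profile, preferred, steps=200, lr=0.05):
    """Learn an additive bias on a fixed noisy profile.

    Gradient ascent on the confidence is the same as gradient descent on
    the loss (-confidence). Only the bias is updated; the input profile
    itself is never modified.
    """
    bias = [[0.0] * len(row) for row in noisy_profile]
    history = []
    for _ in range(steps):
        biased = [
            [p + b for p, b in zip(row_p, row_b)]
            for row_p, row_b in zip(noisy_profile, bias)
        ]
        history.append(toy_confidence(biased, preferred))
        grad = toy_confidence_grad(biased, preferred)
        bias = [
            [b + lr * g for b, g in zip(row_b, row_g)]
            for row_b, row_g in zip(bias, grad)
        ]
    return bias, history


# Toy data: a 5-position, 4-letter "profile" corrupted by Gaussian noise.
random.seed(0)
preferred = [[random.random() for _ in range(4)] for _ in range(5)]
noisy = [[w + random.gauss(0.0, 0.5) for w in row] for row in preferred]

bias, history = learn_bias(noisy, preferred)
print(f"confidence before: {history[0]:.4f}, after: {history[-1]:.6f}")
```

In this sketch the bias absorbs the noise because the toy confidence has a single optimum. With a real network the confidence landscape is not guaranteed to be this well behaved, so a higher confidence would not automatically mean a better structure.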
biochemical research methods, mathematical & computational biology
What problem does this paper attempt to address?