Deep-Learning Structure Elucidation from Single-Mutant Deep Mutational Scanning

Zachary C Drake, Elijah Day, Paul Toth, Steffen C Lindert
DOI: https://doi.org/10.1101/2024.05.19.594322
2024-05-19
Abstract: Deep learning has revolutionized the field of protein structure prediction. AlphaFold2, a deep neural network, vastly outperformed previous algorithms to provide near atomic-level accuracy when predicting protein structures. Despite its success, there still are limitations which prevent accurate predictions for numerous protein systems. Here we show that sparse residue burial restraints from deep mutational scanning (DMS) can refine AlphaFold2 to significantly enhance results. Burial information extracted from DMS is used to explicitly guide residue placement during structure generation. DMS-Fold was validated on both simulated and experimental single-mutant DMS, with DMS-Fold outperforming AlphaFold2 for 89% of protein targets and with 253 proteins having an improvement greater than 0.1 in TM-score. DMS-Fold is free and publicly available: https://github.com/LindertLab/DMS-Fold.
Biochemistry
What problem does this paper attempt to address?
This paper addresses limitations in protein structure prediction. Although AlphaFold2 has been highly successful, it still struggles with dynamic proteins, mutation effects, orphan proteins, and disordered proteins. To address this, the paper proposes DMS-Fold, a deep-learning model that uses single-mutant deep mutational scanning (DMS) data to improve on AlphaFold2's predictions. The main contributions of the paper are:

1. **Extract burial information**: Analyze large-scale single-mutant DMS data to extract burial information for protein residues. This information reflects how deeply each residue sits inside the protein and can therefore guide structure generation.
2. **Develop DMS-Fold**: Build a new deep neural network, DMS-Fold, on the AlphaFold2 framework. The model embeds the burial information from DMS data into AlphaFold2's pair representation to guide residue placement, thereby improving prediction accuracy.
3. **Verify the effectiveness of DMS-Fold**: Validate DMS-Fold on both simulated and experimental data. DMS-Fold outperforms AlphaFold2 on 89% of protein targets, and 253 targets improve by more than 0.1 in TM-score.
4. **Explore the influence of MSA**: Study how different levels of multiple sequence alignment (MSA) sampling affect the predictions of DMS-Fold and AlphaFold2. With shallower MSAs, DMS-Fold relies more on the burial information and thus shows better performance.

In conclusion, the paper combines DMS data with deep learning in a new protein structure prediction model, DMS-Fold, which significantly improves prediction accuracy, particularly for protein systems where AlphaFold2 alone falls short.
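To make the idea of per-residue burial information feeding a pairwise representation more concrete, here is a minimal sketch in plain Python. It is not the paper's scoring function and not DMS-Fold's actual embedding; it only illustrates the general shape of the data flow. The function names `burial_proxy` and `pair_feature` are hypothetical, and the sketch assumes that more negative DMS scores mean more damaging mutations, that mutation-sensitive positions tend to be buried, and that every residue has at least one measured substitution.

```python
from statistics import mean


def burial_proxy(dms_scores):
    """Turn a single-mutant DMS table into a per-residue burial proxy.

    dms_scores: one list per residue, holding the measured effect of each
    substitution at that position (assumed: more negative = more damaging).
    Returns one value per residue, scaled to [0, 1], where 1 marks the most
    mutation-sensitive position (assumed here to correlate with burial).
    """
    # Average mutational effect per position, sign-flipped so that
    # sensitive positions get larger values.
    sensitivity = [-mean(row) for row in dms_scores]
    lo, hi = min(sensitivity), max(sensitivity)
    if hi == lo:
        # No variation between positions: nothing informative to encode.
        return [0.0 for _ in sensitivity]
    # Min-max scaling to [0, 1].
    return [(s - lo) / (hi - lo) for s in sensitivity]


def pair_feature(burial):
    """Broadcast per-residue values into an L x L grid of (b_i, b_j) pairs.

    Hypothetical stand-in for adding burial channels to a pair
    representation; the real model's embedding may differ.
    """
    return [[(bi, bj) for bj in burial] for bi in burial]


# Toy example: three residues, two measured substitutions each.
toy_dms = [[-2.0, -1.0], [0.0, -0.2], [-0.5, -0.5]]
toy_burial = burial_proxy(toy_dms)
toy_pairs = pair_feature(toy_burial)
```

In this toy input the first residue is the most sensitive to mutation and the second the least, so they receive the extreme values of the proxy, and entry `[i][j]` of the grid carries the proxy values of residues `i` and `j` together.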