Classification of MLH1 Missense VUS Using Protein Structure-Based Deep Learning-Ramachandran Plot-Molecular Dynamics Simulations Method

Benjamin Tam, Zixin Qin, Bojin Zhao, Siddharth Sinha, Chon Lok Lei, San Ming Wang
DOI: https://doi.org/10.3390/ijms25020850
IF: 5.6
2024-01-11
International Journal of Molecular Sciences
Abstract: Pathogenic variation in DNA mismatch repair (MMR) gene MLH1 is associated with Lynch syndrome (LS), an autosomal dominant hereditary cancer. Of the 3798 MLH1 germline variants collected in the ClinVar database, 38.7% (1469) were missense variants, of which 81.6% (1199) were classified as Variants of Uncertain Significance (VUS) due to the lack of functional evidence. Further determination of the impact of VUS on MLH1 function is important for the VUS carriers to take preventive action. We recently developed a protein structure-based method named "Deep Learning-Ramachandran Plot-Molecular Dynamics Simulation (DL-RP-MDS)" to evaluate the deleteriousness of MLH1 missense VUS. The method extracts protein structural information by using the Ramachandran plot-molecular dynamics simulation (RP-MDS) method, then combines the variation data with an unsupervised learning model composed of auto-encoder and neural network classifier to identify the variants causing significant change in protein structure. In this report, we applied the method to classify 447 MLH1 missense VUS. We predicted 126/447 (28.2%) MLH1 missense VUS were deleterious. Our study demonstrates that DL-RP-MDS is able to classify the missense VUS based solely on their impact on protein structure.
biochemistry & molecular biology,chemistry, multidisciplinary
What problem does this paper attempt to address?
This paper addresses how to classify missense variants in the MLH1 gene, particularly Variants of Uncertain Significance (VUS), using protein structure analysis. The authors developed a protein structure-based deep learning method, DL-RP-MDS (Deep Learning-Ramachandran Plot-Molecular Dynamics Simulation), to evaluate the impact of MLH1 missense VUS and classify each variant as deleterious or not.

### Background
- **MLH1 gene**: MLH1 is a DNA mismatch repair (MMR) gene. Pathogenic variation in MLH1 is associated with Lynch syndrome (LS), an autosomal dominant hereditary cancer.
- **VUS problem**: Of the 3,798 MLH1 germline variants in the ClinVar database, 1,469 (38.7%) are missense variants, and 1,199 (81.6%) of those are classified as VUS because functional evidence is lacking. Carriers of these VUS cannot determine their cancer risk and therefore cannot take appropriate preventive measures.
- **Limitations of existing methods**: Current functional classification methods have limitations when dealing with large numbers of genetic variants, especially in balancing sensitivity and specificity.

### Methods
- **DL-RP-MDS method**:
  - **Ramachandran plot**: Maps the impact of unclassified missense variants on protein structural stability.
  - **Molecular dynamics simulations**: Simulate how the amino acid change caused by a missense variant alters the dynamic trajectories of the protein's atoms.
  - **Deep learning**: Combines an auto-encoder with a multi-layer neural network classifier. The auto-encoder compresses the information in the Ramachandran plot into a low-dimensional latent space, which the classifier uses to classify variants.

### Results
- **Dataset**: 44 known pathogenic variants, 8 known benign variants, and 447 VUS.
- **Model training**: A 1-microsecond simulation of wild-type MLH1 was used to compensate for the small number of benign variants.
- **Classification results**: DL-RP-MDS predicted that 126 of the 447 VUS (28.2%) were deleterious.

### Significance
- **Structure-based classification**: DL-RP-MDS classifies missense VUS based solely on their impact on protein structure.
- **Clinical application**: The method can help VUS carriers better understand their cancer risk and take appropriate preventive measures.

In conclusion, this study provides a new method for classifying missense VUS in the MLH1 gene by combining protein structure analysis with deep learning, which can help improve the functional classification of genetic variants.
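
The Ramachandran plot step named in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dihedral`, `backbone_phi_psi`, `ramachandran_density`), the 10-degree bin width, and the random placeholder coordinates are all assumptions made for illustration. The sketch computes backbone phi/psi dihedral angles from N, CA, and C atom coordinates and bins them into a normalized 2D histogram, which is the kind of fixed-size grid an auto-encoder could take as input.

```python
import numpy as np


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points."""
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Remove the components of the outer bonds that lie along the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.degrees(np.arctan2(y, x))


def backbone_phi_psi(n, ca, c):
    """Phi/psi angles for the interior residues of one structure.

    n, ca, c: arrays of shape (n_residues, 3) holding backbone N, CA, C coordinates.
    Returns two arrays of length n_residues - 2 (each terminus lacks one angle).
    """
    phi, psi = [], []
    for i in range(1, len(ca) - 1):
        phi.append(dihedral(c[i - 1], n[i], ca[i], c[i]))  # C(i-1)-N(i)-CA(i)-C(i)
        psi.append(dihedral(n[i], ca[i], c[i], n[i + 1]))  # N(i)-CA(i)-C(i)-N(i+1)
    return np.array(phi), np.array(psi)


def ramachandran_density(phi, psi, bins=36):
    """Normalized 2D histogram of (phi, psi); bins=36 gives 10-degree cells."""
    hist, _, _ = np.histogram2d(
        phi, psi, bins=bins, range=[[-180, 180], [-180, 180]]
    )
    return hist / hist.sum()


# Placeholder input: random coordinates, NOT real MLH1 data.
rng = np.random.default_rng(0)
n, ca, c = (rng.normal(size=(50, 3)) for _ in range(3))
phi, psi = backbone_phi_psi(n, ca, c)
density = ramachandran_density(phi, psi)
print(density.shape)  # (36, 36)
```

For a simulation trajectory, the angles from every frame could be pooled before binning, so that the histogram reflects how a variant shifts the backbone conformations sampled over time rather than a single snapshot.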