Understanding large scale sequencing datasets through changes to protein folding

David Shorthouse, Harris Lister, Gemma S Freeman, Benjamin A Hall
DOI: https://doi.org/10.1093/bfgp/elae007
2024-03-23
Briefings in Functional Genomics
Abstract: The expansion of high-quality, low-cost sequencing has created an enormous opportunity to understand how genetic variants alter cellular behaviour in disease. The high diversity of mutations observed has however drawn a spotlight onto the need for predictive modelling of mutational effects on phenotype from variants of uncertain significance. This is particularly important in the clinic due to the potential value in guiding clinical diagnosis and patient treatment. Recent computational modelling has highlighted the importance of mutation induced protein misfolding as a common mechanism for loss of protein or domain function, aided by developments in methods that make large computational screens tractable. Here we review recent applications of this approach to different genes, and how they have enabled and supported subsequent studies. We further discuss developments in the approach and the role for the approach in light of increasingly high throughput experimental approaches.
genetics & heredity, biotechnology & applied microbiology
What problem does this paper attempt to address?
The paper addresses how gene mutations affect protein folding and function, particularly when interpreting variants of uncertain significance (VUS). It focuses on the following points:

1. **Interpretation of gene mutations**: High-throughput sequencing has produced a large amount of mutation data, but accurately interpreting the functional impact of these mutations remains a challenge. In the clinical setting many mutations are classified as VUS, which complicates diagnosis and the selection of treatment options.
2. **Impact of protein folding changes**: The paper emphasizes that mutation-induced protein misfolding is an important mechanism in many diseases. Calculating the change in protein folding energy (\(\Delta\Delta G\)) allows the impact of a mutation on protein structure and function to be predicted, providing a basis for clinical diagnosis.
3. **Application and development of computational models**: Researchers have developed a variety of computational models and tools, such as FoldX and Rosetta, to evaluate the impact of mutations on protein stability. These tools improve the understanding of individual mutations and can also screen mutation effects across multiple genes at large scale, helping to identify disease-related mutation patterns.
4. **Significance of clinical applications**: Combining experimental verification with computational simulation helps improve clinical guidelines and supports the classification and evaluation of gene mutations. For example, in the study of CDH1 gene mutations, the results of computational models were used to modify the clinical classification criteria, improving the ability to interpret VUS.
5. **Future development directions**: The paper also discusses possible future directions for the field, including improving the accuracy of computational tools, extending the approach to more types of genes and mutations, and integrating these methods into existing machine-learning frameworks to better predict and interpret the impact of gene mutations.

In summary, the paper aims to address key problems in gene mutation interpretation by means of protein folding calculations, and to provide a scientific basis for clinical diagnosis and treatment.
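
As a rough illustration of the \(\Delta\Delta G\) idea in point 2, the sketch below computes \(\Delta\Delta G = \Delta G_{\text{mutant}} - \Delta G_{\text{wild type}}\) for a few variants and flags those above a stability cutoff. The sign convention (positive values are destabilising) follows the one commonly reported by tools such as FoldX. Everything else is an assumption made for illustration: the `Variant` class, the `classify` helper, the 2.0 kcal/mol cutoff, and the energy values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Illustrative cutoff in kcal/mol. This value is an assumption for the sketch,
# not a threshold stated in the paper.
DESTABILISING_CUTOFF = 2.0


@dataclass(frozen=True)
class Variant:
    """A missense variant with folding free energies in kcal/mol (hypothetical data)."""
    name: str
    dg_wildtype: float  # folding energy of the wild-type structure
    dg_mutant: float    # folding energy of the mutated structure

    @property
    def ddg(self) -> float:
        # Positive ddG means the mutant is less stable than the wild type.
        return self.dg_mutant - self.dg_wildtype


def classify(variant: Variant, cutoff: float = DESTABILISING_CUTOFF) -> str:
    """Label a variant by whether its ddG meets the destabilisation cutoff."""
    if variant.ddg >= cutoff:
        return "likely destabilising"
    return "uncertain"


# Made-up example values; a real screen would take these from a stability predictor.
variants = [
    Variant("A100V", dg_wildtype=-10.0, dg_mutant=-9.5),
    Variant("G200R", dg_wildtype=-10.0, dg_mutant=-6.5),
]

for v in variants:
    print(f"{v.name}: ddG = {v.ddg:+.1f} kcal/mol -> {classify(v)}")
```

In practice the energies would come from a structure-based predictor run over every possible substitution in a protein, and the cutoff would be calibrated against variants of known effect rather than fixed in advance.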