Quantify genetic variants’ regulatory potential via a hybrid sequence-oriented model

Yu Wang, Nan Liang, Ge Gao
DOI: https://doi.org/10.1101/2024.03.28.587115
2024-04-01
Abstract: Understanding how noncoding DNA determines gene expression is critical for decoding the functional genome. Leveraging a hybrid sequence-oriented architecture, we developed SVEN to model (and predict) tissue-specific transcriptomic impacts for large-scale structural variants across over 200 tissues and cell lines. We expect that SVEN will enable more effective analysis and interpretation of human genome-wide disease-related genetic variants.
Bioinformatics
What problem does this paper attempt to address?
This paper addresses how to quantify the regulatory potential of genetic variants with a hybrid sequence-oriented model, focusing on the impact of large-scale structural variants (SVs) on gene expression across more than 200 tissues and cell lines. The researchers developed a model named SVEN to support more effective computational analysis and interpretation of disease-related genetic variants across the human genome, and to shed light on the key question of how noncoding DNA determines gene expression. SVEN combines deep-learning networks with gradient-boosting trees: it learns regulatory grammar from promoter-proximal sequences and infers gene-expression levels in a tissue-specific manner.
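To make the idea of scoring a structural variant's tissue-specific impact concrete, here is a minimal sketch of reference-versus-alternate scoring for a deletion. It is not SVEN's implementation. Every name in it is hypothetical: the motif counter is only a placeholder for the deep-learning feature stage, the per-tissue linear model is only a placeholder for the gradient-boosting stage, and the motifs, weights, and tissue labels are made-up illustration values.

```python
def apply_deletion(ref_seq, start, end):
    """Return the alternate sequence with ref_seq[start:end] removed (0-based, end-exclusive)."""
    if not 0 <= start <= end <= len(ref_seq):
        raise ValueError("deletion coordinates fall outside the sequence")
    return ref_seq[:start] + ref_seq[end:]


def sequence_features(seq, motifs):
    """Stage 1 placeholder: count (possibly overlapping) occurrences of each motif.

    A real model would derive features from a trained deep network instead.
    """
    seq = seq.upper()
    counts = []
    for motif in motifs:
        width = len(motif)
        hits = sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)
        counts.append(hits)
    return counts


def predict_expression(features, weights, bias):
    """Stage 2 placeholder: a linear model standing in for gradient-boosted trees."""
    return bias + sum(w * f for w, f in zip(weights, features))


def variant_effect(ref_seq, start, end, motifs, tissue_models):
    """Score a deletion as predicted(alt) - predicted(ref), separately per tissue."""
    alt_seq = apply_deletion(ref_seq, start, end)
    ref_feats = sequence_features(ref_seq, motifs)
    alt_feats = sequence_features(alt_seq, motifs)
    effects = {}
    for tissue, (weights, bias) in tissue_models.items():
        ref_pred = predict_expression(ref_feats, weights, bias)
        alt_pred = predict_expression(alt_feats, weights, bias)
        effects[tissue] = alt_pred - ref_pred
    return effects


if __name__ == "__main__":
    # Hypothetical promoter-proximal sequence, motifs, and per-tissue parameters.
    ref = "ACGTTATAAAGGGCGGACGT"
    motifs = ["TATA", "GGGCGG"]
    tissue_models = {
        "liver": ([2.0, 1.0], 0.5),
        "brain": ([0.5, 1.5], 1.0),
    }
    # Delete the TATA motif at positions 4..7 and compare predictions.
    print(variant_effect(ref, 4, 8, motifs, tissue_models))
```

The two stages are kept as separate functions so that either placeholder could be swapped for a trained component without changing the reference-versus-alternate comparison itself.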