BRAIN-MAGNET: A novel functional genomics atlas coupled with convolutional neural networks facilitates clinical interpretation of disease relevant variants in non-coding regulatory elements

Ruizhi Deng,Elena Perenthaler,Anita Nikoncuk,Soheil Yousefi,Kristina Lanko,Rachel Schot,Michela Maresca,Eva Medico-Salsench,Leslie E. Sanderson,Michael J. Parker,Wilfred F.J. van Ijcken,Joohyun Park,Marc Sturm,Tobias B. Haack,Genomics England Research Consortium,Gennady V Roshchupkin,Eskeatnaf Mulugeta,Tahsin Stefan Barakat
DOI: https://doi.org/10.1101/2024.04.13.24305761
2024-09-24
Abstract:Genome-wide assessment of genetic variation is becoming routine in human genetics, but functional interpretation of non-coding variants both in common and rare diseases remains extremely challenging. Here, we employed the massively parallel reporter assay ChIP-STARR-seq to functionally annotate the activity of >145 thousand non-coding regulatory elements (NCREs) in human neural stem cells, modelling early brain development. Highly active NCREs show increased sequence constraint and harbour de novo variants in individuals affected by neurodevelopmental disorders. They are enriched for transcription factor (TF) motifs including YY1 and p53 family members and for primate-specific transposable elements, providing insights on gene regulatory mechanisms in NSCs. Examining episomal NCRE activity of the same sequences in human embryonic stem cells identified cell type differential activity and primed NCREs, accompanied by a rewiring of the epigenome landscape. Leveraging the experimentally measured NCRE activity and nucleotide composition of the assessed sequences, we built BRAIN-MAGNET, a functionally validated convolutional neural network that predicts NCRE activity based on DNA sequence composition and identifies functionally relevant nucleotides required for NCRE function. The application of BRAIN-MAGNET allows fine-mapping of GWAS loci identified for common neurological traits and prioritizing of possible disease-causing rare non-coding variants in currently genetically unexplained individuals with neurogenetic disorders, including those from the Genomics England 100,000 Genomes project, identifying novel enhanceropathies. We foresee that this NCRE atlas and BRAIN-MAGNET will help reduce missing heritability in human genetics by limiting the search space for functionally relevant non-coding genetic variation.
What problem does this paper attempt to address?