Abstract: A key challenge in applying whole-genome sequencing (WGS) to clinical diagnostics and research is the high-throughput prioritization of functional variants in the non-coding genome. This challenge is compounded by context-specific genetic modulation of gene expression: variant-gene mapping depends on the tissues and organ systems affected in a given disease. For instance, a disease affecting the gastrointestinal system would use maps specific to genome regulation in gut-related tissues. While there are large-scale atlases of genome regulation, such as GTEx and NIH Roadmap Epigenomics, the clinical genetics community lacks publicly available stand-alone software for high-throughput annotation of custom variant data with user-defined tissue-specific epigenetic maps and clinical genetic databases, in order to prioritize variants for a specific biomedical application. In this work, we provide a simple software pipeline, called SNPnotes, which takes as input variant calls for a patient and prioritizes them using information on clinical relevance from ClinVar, tissue-specific gene regulation from GTEx, and disease associations from the NHGRI-EBI GWAS Catalog. This pipeline was developed as part of SVAI Research's "Undiagnosed-1" event for collaborative patient diagnosis. We applied this pipeline to WGS-based variant calls for an individual with a history of gastrointestinal symptoms, using 12 gut-specific eQTL maps and GWAS associations for metabolic diseases for variant-gene mapping. Out of 6,248,584 SNPs, the pipeline identified 151 high-priority variants, overlapping 129 genes. These top SNPs all have known clinical pathogenicity, modulate gene expression in gut tissues, and have genetic associations with metabolic disorders; they serve as starting points for hypotheses about mechanisms driving clinical symptoms. Simple software changes can be made to customize the pipeline for other tissue-specific applications. Future extensions could integrate maps of tissue-specific regulatory elements, higher-order chromatin loops, and mutations affecting splice variants.
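The prioritization logic described above can be illustrated with a minimal sketch. It is not the SNPnotes implementation: the function name `prioritize`, the in-memory data layout, and the toy rsIDs, gene names, and tissue labels are all hypothetical, and a real run would load ClinVar, GTEx eQTL, and GWAS records from files. The sketch keeps only SNPs that satisfy all three criteria named in the abstract and reports the genes they map to through the tissue-specific eQTL maps.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical layout: one eQTL map per tissue, rsID -> genes it modulates.
EqtlMap = Dict[str, Set[str]]


def prioritize(
    patient_snps: Iterable[str],
    clinvar_pathogenic: Set[str],
    eqtl_maps: Dict[str, EqtlMap],
    gwas_snps: Set[str],
) -> List[Tuple[str, List[str]]]:
    """Return (rsID, genes) for SNPs passing all three filters."""
    hits = []
    for rsid in patient_snps:
        # Filters 1 and 2: clinical pathogenicity and disease association.
        if rsid not in clinvar_pathogenic or rsid not in gwas_snps:
            continue
        # Filter 3: the SNP is an eQTL in at least one selected tissue.
        genes: Set[str] = set()
        for tissue_map in eqtl_maps.values():
            genes |= tissue_map.get(rsid, set())
        if genes:
            hits.append((rsid, sorted(genes)))
    return hits


# Toy inputs (assumed values, for illustration only).
patient = ["rs1", "rs2", "rs3", "rs4"]
clinvar = {"rs1", "rs2", "rs4"}
eqtl = {
    "colon_transverse": {"rs1": {"GENE_A"}, "rs3": {"GENE_C"}},
    "stomach": {"rs1": {"GENE_B"}, "rs2": {"GENE_D"}},
}
gwas = {"rs1", "rs3", "rs4"}

top = prioritize(patient, clinvar, eqtl, gwas)
print(top)
```

In this toy run only `rs1` passes all three filters. Adapting the sketch to another organ system would amount to passing a different set of tissue maps and GWAS traits, which mirrors the customization described in the abstract.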


SNPnotes: high-throughput tissue-specific functional annotation of single nucleotide variants
