Closing the gap: Solving complex medically relevant genes at scale
Medhat Mahmoud, John Harting, Holly Corbitt, Xiao Chen, Shalini N. Jhangiani, Harsha Doddapaneni, Qingchang Meng, Tina Han, Christine Lambert, Siyuan Zhang, Primo Baybayan, Geoff Henno, Hua Shen, Jianhong Hu, Yi Han, Casey Riegler, Ginger Metcalf, Ivan K. Chinn, Michael A. Eberle, Sarah Kingan, Tim Farinholt, Claudia M.B. Carvalho, Richard A. Gibbs, Zev Kronenberg, Donna Muzny, Fritz J. Sedlazeck
DOI: https://doi.org/10.1101/2024.03.14.24304179
2024-03-18
Abstract: Comprehending the mechanism behind human diseases with an established heritable component represents the forefront of personalized medicine. Nevertheless, numerous medically important genes are inaccurately represented in short-read sequencing data analysis owing to their complexity and repetitiveness or their location in the so-called ‘dark regions’ of the human genome. The advent of PacBio as a long-read platform has provided new insights, yet the cost of HiFi whole-genome sequencing (WGS) is often prohibitive. We introduce a targeted sequencing and analysis framework, the Twist Alliance Dark Genes Panel (TADGP), designed to offer phased variants across 389 medically important yet complex autosomal genes. We assess TADGP accuracy across eleven control samples and compare it to WGS. This comparison demonstrates that TADGP achieves variant calling accuracy comparable to HiFi-WGS data at a fraction of the cost, enabling scalability and broad applicability for studying rare diseases or for complementing previously sequenced samples to gain insights into these complex genes. When tested on samples from rare disease and cardiovascular disease cohorts, TADGP revealed several candidate variants across all cases and provided insight into diversity. In both cohorts, we identified novel variants affecting individual disease-associated genes (e.g., ). Nevertheless, the annotation of variants across these 389 medically important genes remains challenging because the genes are underrepresented in ClinVar and gnomAD. Consequently, we also offer an annotation resource to enhance the evaluation and prioritization of these variants. Overall, we demonstrate that TADGP offers a cost-efficient and scalable approach to routinely assess clinically relevant dark regions of the human genome.
Genetic and Genomic Medicine