Dissecting the Reduced Penetrance of Putative Loss-of-Function Variants in Population-Scale Biobanks

David R Blair, Neil Risch
DOI: https://doi.org/10.1101/2024.09.23.24314008
2024-10-07
Abstract: Loss-of-function variants (LoFs) disrupt the activity of their impacted gene. They are often associated with clinical phenotypes, including autosomal dominant diseases driven by haploinsufficiency. Recent analyses using biobanks have suggested that LoF penetrance for some haploinsufficient disorders may be low, an observation that has important implications for population genomic screening. However, biobanks are also rife with missing data, and the reliability of these findings remains uncertain. Here, we examine the penetrance of putative LoFs (pLoFs) using a cohort of approximately 24,000 carriers derived from two population-scale biobanks: the UK Biobank and the All of Us Research Program. We investigate several possible etiologies for reduced pLoF penetrance, including biobank recruitment biases, annotation artifacts, missed diagnoses, and incomplete clinical records. Systematically accounting for these factors increased penetrance, but widespread reduced penetrance remained. Therefore, we hypothesized that other factors must be driving this phenomenon. To test this, we trained machine learning models to identify pLoFs with high penetrance using the genomic features specific to each variant. These models were predictive of penetrance across a range of diseases and pLoF types, including those with prior evidence for pathogenicity. This suggests that reduced pLoF penetrance is in fact common, and care should be taken when counseling asymptomatic carriers.
Genetic and Genomic Medicine