Insights into the causes and consequences of DNA repeat expansions from 700,000 biobank participants

Margaux Louise Anna Hujoel, Robert E Handsaker, Nolan Kamitaki, Ronen E Mukamel, Simone Rubinacci, Pier F. Palamara, Steven A McCarroll, Po-Ru Loh
DOI: https://doi.org/10.1101/2024.11.25.625248
2024-11-26
Abstract: Expansions and contractions of tandem DNA repeats are a source of genetic variation in human populations and in human tissues: some expanded repeats cause inherited disorders, and some are also somatically unstable. We analyzed DNA sequence data, derived from the blood cells of >700,000 participants in UK Biobank and the Research Program, and developed new computational approaches to recognize, measure and learn from DNA-repeat instability at 15 highly polymorphic CAG-repeat loci. We found that expansion and contraction rates varied widely across these 15 loci, even for alleles of the same length; repeats at different loci also exhibited widely variable relative propensities to mutate in the germline versus the blood. The high somatic instability of repeats enabled a genome-wide association analysis that identified seven loci at which inherited variants modulate repeat instability in blood cells. Three of the implicated loci contained genes ( , , and ) that also modulate Huntington's disease age-at-onset as well as somatic instability of the repeat in blood; however, the specific genetic variants and their effects (instability-increasing or -decreasing) appeared to be tissue-specific and repeat-specific, suggesting that somatic mutation in different tissues - or of different repeats in the same tissue - proceeds independently and under the control of substantially different genetic variation. Additional modifier loci included DNA damage response genes and . Analyzing DNA repeat expansions together with clinical data showed that inherited repeats in the 5' UTR of the glutaminase ( ) gene are associated with stage 5 chronic kidney disease (OR=14.0 [5.7-34.3]) and liver diseases (OR=3.0 [1.5-5.9]). These and other results point to the dynamics of DNA repeats in human populations and across the human lifespan.
Genetics