Optical genome mapping enables accurate repeat expansion testing

Bart van der Sanden, Kornelia Neveling, Syukri Shukor, Michael D. Gallagher, Joyce Lee, Stephanie L. Burke, Maartje Pennings, Ronald van Beek, Michiel Oorsprong, Ellen Kater-Baats, Eveline Kamping, Alide Tieleman, Nicol Voermans, Ingrid E. Scheffer, Jozef Gecz, Mark A. Corbett, Lisenka E.L.M. Vissers, Andy Wing Chun Pang, Alex Hastie, Erik-Jan Kamsteeg, Alexander Hoischen
DOI: https://doi.org/10.1101/2024.04.19.590273
2024-04-22
Abstract: Short tandem repeats (STRs) are among the most abundant classes of variation in human genomes and are meiotically and mitotically unstable, which leads to expansions and contractions. STR expansions are frequently associated with genetic disorders, with the size of the expansion often correlating with severity and age of onset. Accurately measuring the total repeat expansion length and identifying potential somatic repeat instability is therefore important. Current standard of care (SOC) diagnostic assays include laborious repeat-primed PCR-based tests as well as Southern blotting, which are unable to precisely size long repeat expansions and/or require a separate set-up for each locus. Sequencing-based assays have proven their potential for the genome-wide detection of repeat expansions but have not yet replaced these diagnostic assays, owing to their inaccuracy in detecting long repeat expansions (short-read sequencing) and their cost (long-read sequencing). Here, we tested whether optical genome mapping (OGM) can efficiently and accurately identify the STR length and assess the stability of known repeat expansions. We performed OGM for 85 samples with known clinically relevant repeat expansions in DMPK, CNBP and RFC1, causing myotonic dystrophy type 1 and 2 and cerebellar ataxia, neuropathy and vestibular areflexia syndrome (CANVAS), respectively. After performing OGM, we applied three different repeat expansion detection workflows: manual assembly, local guided assembly (local-GA) and a molecule distance script, of which the latter two were developed as part of this study. The first two workflows estimated the repeat size for each of the two alleles, while the third workflow was used to detect potential somatic instability. The estimated repeat sizes were compared to the repeat sizes reported by the SOC assays, and concordance between the results was determined.
All but one of the known repeat expansions above the pathogenic repeat size threshold were detected by OGM, and allelic differences were distinguishable, either between wildtype and expanded alleles or between two expanded alleles in recessive cases. An apparent strength of OGM over current SOC methods was the more accurate length measurement, especially for very long repeat expansion alleles, with no upper size limit. In addition, by analysing intact, native DNA molecules, OGM enabled the detection of somatic repeat instability, which was detected in 9/30 DMPK, 23/25 CNBP and 4/30 RFC1 samples. In conclusion, for tandem repeat expansions larger than ∼300 bp, OGM provides an efficient method to identify exact repeat lengths and somatic repeat instability with high confidence across multiple loci simultaneously, potentially providing a significantly improved, generic genome-wide assay for repeat expansion disorders.
Genomics