Editorial: The clinical utility of long read sequencing to improve diagnostic yield and uncover biological mechanisms in rare disease

Lidia Larizza, Christopher M. Watson, Madelyn A. Gillentine, Palma Finelli
DOI: https://doi.org/10.3389/fgene.2024.1494860
IF: 3.7
2024-10-08
Frontiers in Genetics
Abstract: Long-read sequencing (LRS), a multi-omics technology impacting genomics, epigenomics, and transcriptomics, overcomes the limitations of short-read exome/genome sequencing (SRS) and other second-generation techniques in disclosing the hidden basis of rare genetic diseases (Mastrorosa et al., 2023; Yu et al., 2023; Kernohan and Boycott, 2024). The figure summarizes properties unique to LRS, such as the ability to reveal the precise configuration of simple and complex structural variants (including those originating from chromothripsis), the genomic configuration of genes and pseudogenes, the unique episignature of diseases, the unstable expansions of tandem repeat disorders, the epi-transcriptomic modifications of imprinting disorders, and variants in intronic regions that may cause abnormal splicing of transcripts as well as new transcripts. These capabilities allow LRS to improve on the unsatisfactory diagnostic yield of SRS (25-50%) (Sullivan et al., 2023). Currently, LRS is mainly applied in the research context and is not yet available in the clinical setting. Our themed research proposal is a small contribution toward promoting the application of LRS in the clinical setting, prioritizing groups of rare diseases in which molecular mechanisms are hard to identify by second-generation sequencing technologies. The article by Ura H et al. addresses the development of target-capture full-length double-stranded cDNA sequencing by nanopore LRS to uncover intronic variants in the Tuberous Sclerosis type 1 and type 2 (TSC1 and TSC2) genes in a clinically affected but molecularly undiagnosed individual. Deep intronic variants generate novel transcripts with intron retention, leading to a truncated protein and a decreased potential of full-length isoforms compared with healthy controls. The authors define the repertoire (number, coverage, exon number, transcript length) of TSC1 and TSC2 transcripts and focus on the "protein coding" transcripts. 
Reduced expression of such transcripts leads to the identification of a TSC2 variant in the proband, which is then validated by an in vitro assay. Besides providing a diagnosis for this individual, the delineated multi-step experimental pathway provides the essential information needed to monitor full-length alternative splicing of transcripts for the diagnosis of genetic diseases. Another shortcoming of second-generation sequencing techniques, even when coupled to high-resolution array-based comparative genomic hybridization (a-CGH), is the recognition and resolution of complex structural variants. The article by Bestetti I et al. addresses this challenging issue by targeted Oxford Nanopore sequencing of a proband with a clinical diagnosis of Cornelia de Lange syndrome (CDLS), who was shown by first-tier testing to have the abnormal karyotype 46,XY,t(5;15)(p13;q25)dn. FISH analyses mapped the putative translocation breakpoints on der(5) within intron 2 of the ADAMTS12 gene (3 Mb from the NIPBL 5'UTR) and on der(15) within intron 1 of the SEMA4B gene. While the halved amount of NIPBL transcript from exon 23 to the 3'UTR, assessed by RT-qPCR, accounted for the clinical phenotype, only LRS unraveled the configuration and origin of the cryptic complex structural variant (cxSV). Besides confirming the previous mapping on the derivative chromosomes, analysis of the 28.3 Gb of data produced by ONT sequencing showed the signature of a previous chromothripsis event on der(5), which shattered a 7.3 Mb region at 5p13.2, comprising 44 coding genes, into 17 fragments relocated in random order and orientation, with 36 underlying breaks. Despite the large number of coding genes, the "all at once" rearrangement on der(5) disrupted only 3 genes, with a single break in ADAMTS12 and in C6 and 16 breaks in NIPBL. 
Notably, NIPBL was the main target, with 16 breaks clustering between introns 21 and 41, several coinciding with repeated SINE and LINE elements and a segmental duplication at intron 21, suggesting that this unstable region is prone to rearrangement. A single breakpoint was identified on der(15), where the juxtaposition of the short arm of chromosome 5 and the long arm of chromosome 15 led to a fusion gene between SEMA4B (5'UTR-intron 1) and ADAMTS12 (intron 2-3'UTR), which does not contribute to the clinical phenotype because it is not transcribed. In conclusion, the NIPBL gene, which accounts for 50-60% of CDLS cases, is worth assessing by LRS to unravel gross rearrangements in clinically suspected, molecularly undiagnosed cases. The review article by Olivucci G et al. provides a comprehensive and critical overview of the advantages of LRS for the diagnosis of rare genetic diseases, predicting, as suggested by the title, a trend towards its application in the clinical context. Given the ability of LRS technologies to sequence long nucleic acid molecules (10 to 100 kb and longer) with improved mappability and to evaluate different types of alterations in a single analysis, the authors review SRS shortcomings -Abstract Truncated-
genetics & heredity