Abstract: Rare diseases are usually chronically debilitating or even life-threatening, and they pose diagnostic and therapeutic challenges in current clinical practice. It has been estimated that 80% of rare diseases are genetic in origin, and thus genome sequencing-based diagnosis offers a promising alternative for rare-disease management. In this study, 79 individuals from 16 independent families underwent whole-genome sequencing (WGS) in an effort to identify the causative mutations for 16 distinct rare diseases that are largely clinically intractable. Comprehensive analysis of variations, including simple nucleotide variants (SNVs), copy-number variations (CNVs), and structural variations (SVs), was implemented using the WGS data. A flexible analysis pipeline that allowed a certain degree of misclassification of disease status was developed to facilitate the identification of causative variants. As a result, disease-causing variants were identified in 10 of the 16 investigated diseases, yielding a diagnostic rate of 62.5%. Additionally, new potentially pathogenic variants were discovered for two disorders: IGF2/INS-IGF2 in mitochondrial disease and FBN3 in Klippel–Trenaunay–Weber syndrome. Our WGS analysis not only detected a CNV associated with 3p deletion syndrome but also captured a simple sequence repeat (SSR) variation associated with Machado–Joseph disease. To our knowledge, this is the first time that clinical WGS analysis of short-read sequences has been used successfully to identify a causative SSR variation that perfectly segregates with a repeat expansion disorder. After the WGS analysis, we confirmed the initial diagnosis for three of the 10 established disorders and modified or corrected the initial diagnosis for the remaining seven. In summary, clinical WGS is a powerful tool for the diagnosis of rare diseases, and its diagnostic clarity at the molecular level offers important benefits for the participating families.

VCGDB: a dynamic genome database of the Chinese population

A comprehensive whole genome database of ethnic minority populations

PGG.SV: a whole-genome-sequencing-based structural variant resource and data analysis platform

PGG.Han: the Han Chinese Genome Database and Analysis Platform.

GPCEG-A database for Genomic Polymorphism of Chinese Ethnic Groups

The Complete and Fully-Phased Diploid Genome of a Male Han Chinese.

The Yh Database: The First Asian Diploid Genome Database

PGG.SNV: Understanding the Evolutionary and Medical Implications of Human Single Nucleotide Variations in Diverse Populations

Genomic Analyses of 10,376 Individuals Provides Comprehensive Map of Genetic Variations, Structure and Reference Haplotypes for Chinese Population

T2T-YAO: A Telomere-to-telomere Assembled Diploid Reference Genome for Han Chinese

A comprehensive repository of mutation data and a clinical assistant decision system for hemoglobinopathy in the Chinese population

Assessing genome-wide copy number variation in the Han Chinese population.

A harmonized public resource of deeply sequenced diverse human genomes

Diagnostic and Clinical Utility of Whole Genome Sequencing in a Cohort of Undiagnosed Chinese Families with Rare Diseases

PGG.Population: a database for understanding the genomic diversity and genetic ancestry of human populations.

Enhancing Variant Calling in Whole-Exome Sequencing Data Using Population-Matched Reference Genomes.

Enhancing Variant Calling in Whole Exome Sequencing (WES) Data Using Population-Matched Reference Genomes

Dbwgfp: A Database and Web Server of Human Whole-Genome Single Nucleotide Variants and Their Functional Predictions

Genetic Landscape of Human Mitochondrial Genome Using Whole-Genome Sequencing.

AGIDB: a versatile database for genotype imputation and variant decoding across species