VizCNV: An integrated platform for concurrent phased BAF and CNV analysis with trio genome sequencing data
Haowei Du, Ming Yin Lun, Lidiia Gagarina, Michele G. Mehaffey, James Paul Hwang, Shalini N. Jhangiani, Sravya V. Bhamidipati, Donna Marie Muzny, M Cecilia Poli, Sebastian Ochoa Gonzalez, Ivan K. Chinn, Anna Linstrand, Jennifer E. Posey, Richard A. Gibbs, James R. Lupski, Claudia M.B. Carvalho
DOI: https://doi.org/10.1101/2024.10.27.620363
2024-10-29
Abstract: Copy number variation (CNV) is a class of genomic structural variation (SV) that underlies genomic disorders and can have profound implications for health. Short-read genome sequencing (sr-GS) enables CNV calling for genomic intervals of variable size and across multiple phenotypes. However, unresolved challenges remain, including an overwhelming number of false-positive calls due to systematic biases from non-uniform read coverage, and collapsed calls resulting from the abundance of paralogous segments and repetitive elements in the human genome. Methods: To address these interpretative challenges, we developed VizCNV, a computational tool for inspecting CNV calls that draws on multiple signal sources from sr-GS data, including read depth, phased B-allele frequency (BAF), and benchmarking signals from other SV calling methods. Its interactive features and view modes are suited to analyzing chromosomal abnormalities [e.g., aneuploidy, segmental aneusomy, and chromosome translocations], exonic CNVs, and non-coding gene regulatory regions. In addition, VizCNV includes a built-in filter schema for trio genomes that prioritizes the detection of impactful germline CNVs, such as de novo CNVs. After parameters were fine-tuned to maximize sensitivity and specificity, VizCNV achieved approximately 83.8% recall and 77.2% precision on 1000 Genomes Project data with an average read depth of 30x. Results: We applied VizCNV to 39 families with primary immunodeficiency disease without a molecular diagnosis. With the built-in filter applied, we identified two de novo CNVs and 90 inherited CNVs >10 kb per trio. Genotype-phenotype analyses revealed that a compound heterozygous combination of a paternal 12.8 kb deletion of exon 5 and a maternal missense variant allele of DOCK8 is likely the molecular cause of disease in one proband.
Conclusions: VizCNV provides a robust platform for genome-wide discovery and visualization of relevant CNVs using sr-GS data.
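Illustrative sketch: the abstract mentions a trio filter that prioritizes de novo CNVs and reports recall and precision, but it does not describe how either is computed. The Python below is one plausible way to express those two ideas, not VizCNV's actual implementation. The `CNV` record, the function names, the 50% reciprocal-overlap rule, and the toy coordinates in the usage note are all assumptions made for illustration; only the 10 kb size cutoff echoes a figure from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    """Hypothetical CNV record; coordinates are half-open [start, end)."""
    chrom: str
    start: int
    end: int
    kind: str  # "DEL" or "DUP"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Smaller of the two overlap fractions; 0.0 if the calls are incompatible."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def classify_trio(proband, father, mother, min_ro=0.5, min_size=10_000):
    """Label each proband call by apparent origin.

    min_ro (assumed 50% reciprocal overlap) decides whether a parental call
    counts as the same event; min_size drops calls below 10 kb.
    """
    labels = {}
    for call in proband:
        if call.end - call.start < min_size:
            continue  # too small, filtered out
        in_father = any(reciprocal_overlap(call, p) >= min_ro for p in father)
        in_mother = any(reciprocal_overlap(call, m) >= min_ro for m in mother)
        if in_father and in_mother:
            labels[call] = "biparental"
        elif in_father:
            labels[call] = "paternal"
        elif in_mother:
            labels[call] = "maternal"
        else:
            labels[call] = "de_novo"
    return labels


def precision_recall(tp: int, fp: int, fn: int):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

As a usage example with made-up coordinates, `classify_trio([CNV("chr1", 100_000, 112_800, "DEL")], [CNV("chr1", 100_500, 112_800, "DEL")], [])` labels the proband deletion as `"paternal"`, while a proband call with no matching parental call is labeled `"de_novo"`. A production filter would also need to handle parental calls missed by the caller, which is one reason visual inspection of read depth and phased BAF remains useful.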
Genomics