Abstract: Several computer programs are available for detecting copy number variants (CNVs) using genome-wide SNP arrays. We evaluated the performance of four CNV detection software suites (Birdsuite, Partek, HelixTree, and PennCNV-Affy) in the identification of both rare and common CNVs. Each program's performance was assessed in two ways. The first was its recovery rate, i.e., its ability to call 893 CNVs previously identified in eight HapMap samples by paired-end sequencing of whole-genome fosmid clones, and 51,440 CNVs identified in 90 HapMap CEU samples by array comparative genomic hybridization (aCGH) followed by validation procedures. The second was its performance in calling rare and common CNVs in the Bipolar Genome Study (BiGS) data set (1001 bipolar cases and 1033 controls, all of European ancestry), as measured by the Affymetrix SNP 6.0 array. Accuracy in calling rare CNVs was assessed by positive predictive value, defined as the proportion of called rare CNVs that were validated by quantitative real-time PCR (qPCR). Accuracy in calling common CNVs was assessed by false positive and false negative rates, based on qPCR validation results from a subset of common CNVs. Birdsuite recovered the highest percentages of known HapMap CNVs containing >20 markers in both reference CNV datasets. The recovery rate increased with decreasing CNV frequency. In the tested rare CNV data, Birdsuite and Partek had higher positive predictive values than the other software suites. In a test of three common CNVs in the BiGS dataset, Birdsuite's calls were 98.8% consistent with qPCR quantification in one CNV region, but accuracy in the other two regions was unacceptably low. We found relatively poor consistency between the two "gold standards": the sequence data of Kidd et al. and the aCGH data of Conrad et al. Algorithms for calling CNVs, especially common ones, need substantial improvement, and a "gold standard" for detection of CNVs remains to be established.
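The two headline metrics in the abstract, recovery rate and positive predictive value, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The abstract does not state the overlap criterion used to decide that a reference CNV was "recovered", so the any-overlap rule below is an assumption, as are the function names, the half-open coordinate convention, and the toy intervals.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed representation.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def recovery_rate(reference: List[Interval], calls: List[Interval]) -> float:
    """Fraction of reference CNVs overlapped by at least one called CNV.

    Assumption: any overlap counts as recovery. A stricter rule (for example
    a minimum reciprocal overlap) would give lower rates.
    """
    if not reference:
        raise ValueError("reference set is empty")
    recovered = sum(
        1 for ref in reference if any(overlaps(ref, call) for call in calls)
    )
    return recovered / len(reference)


def positive_predictive_value(validated: int, tested: int) -> float:
    """Proportion of called CNVs submitted to validation (e.g. qPCR) that were confirmed."""
    if tested <= 0:
        raise ValueError("no calls were tested")
    return validated / tested


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    reference = [("chr1", 100, 200), ("chr1", 500, 600),
                 ("chr2", 100, 200), ("chr3", 50, 80)]
    calls = [("chr1", 150, 250), ("chr2", 90, 110), ("chr2", 300, 400)]
    print(recovery_rate(reference, calls))
    print(positive_predictive_value(validated=18, tested=20))
```

The recovery rate is computed relative to the reference set, so it does not penalize extra calls; the positive predictive value covers that side by counting only calls that were actually tested.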

ExomeHMM: A Hidden Markov Model for Detecting Copy Number Variation Using Whole-Exome Sequencing Data

Detection of Copy Number Variants and Loss of Heterozygosity from Impure Tumor Samples Using Whole Exome Sequencing Data

ERDS-pe: A Paired Hidden Markov Model for Copy Number Variant Detection from Whole-Exome Sequencing Data

Accuracy of CNV Detection from GWAS Data

Exome sequencing identified six copy number variations as a prediction model for recurrence of primary prostate cancers with distinctive prognosis

Modeling Read Counts for CNV Detection in Exome Sequencing Data

SeqCNV: a Novel Method for Identification of Copy Number Variations in Targeted Next-Generation Sequencing Data

Evaluation of Somatic Copy Number Estimation Tools for Whole-Exome Sequencing Data

Copy Number Analysis of Whole-Genome Data Using BIC-seq2 and Its Application to Detection of Cancer Susceptibility Variants

CloneCNA: Detecting Subclonal Somatic Copy Number Alterations in Heterogeneous Tumor Samples from Whole-Exome Sequencing Data

HaplotypeCN: Copy Number Haplotype Inference with Hidden Markov Model and Localized Haplotype Clustering

PEcnv: accurate and efficient detection of copy number variations of various lengths

Allele-specific Copy-Number Discovery from Whole-Genome and Whole-Exome Sequencing

DL-CNV: A Deep Learning Method for Identifying Copy Number Variations Based on Next Generation Target Sequencing

Copy Number Variation Detection in Whole-Genome Sequencing Data Using the Bayesian Information Criterion

Evaluation of tools for identifying large copy number variations from ultra-low-coverage whole-genome sequencing data

BMI-CNV: A Bayesian framework for multiple genotyping platforms detection of copy number variation

Copy Number Variation Detection Using Total Variation

SCCNV: A Software Tool for Identifying Copy Number Variation From Single-Cell Whole-Genome Sequencing

nbCNV: a multi-constrained optimization model for discovering copy number variants in single-cell sequencing data

CNVbd: A Method for Copy Number Variation Detection and Boundary Search