CNVABNN: An AdaBoost algorithm and neural networks-based detection of copy number variations from NGS data

Xuan Wang, Junqing Li, Tihao Huang
DOI: https://doi.org/10.1016/j.compbiolchem.2022.107720
IF: 3.737
2022-08-01
Computational Biology and Chemistry
Abstract: Copy number variation (CNV) is a structural variation of the genome that cannot be neglected. Next-generation sequencing (NGS) technology is widely used to detect CNVs because it offers high throughput and low cost across the whole genome. Building on the original MFCNV method, this paper proposes an improved CNV detection method called CNVABNN. Compared with MFCNV, CNVABNN has three advantages: (1) It adds detectable categories, refining the loss category into hemi_loss and homo_loss. (2) It applies ensemble learning: the AdaBoost algorithm serves as the core framework with neural networks as weak classifiers, and CNVABNN combines all of the weak classifiers into a strong classifier, which improves the overall performance of CNV detection. (3) It optimizes detection by predicting CNVs twice, through neural networks and voting mechanisms. To evaluate the performance of CNVABNN, six existing detection methods are used for comparison. The experimental results show that CNVABNN achieves better precision, sensitivity, and F1-score on both simulated and real samples.
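The combination of weak neural-network classifiers into a strong classifier can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the multi-class SAMME variant of AdaBoost, a single hypothetical feature (a read-depth ratio), and four classes of which only hemi_loss and homo_loss are named in the abstract (normal and gain are assumptions). The names `TinyNet`, `make_toy_data`, `fit_adaboost`, and `predict` are invented for illustration, and the second prediction pass with voting described in point (3) is not reproduced.

```python
import math
import random

# hemi_loss and homo_loss come from the abstract; "normal" and "gain" are assumed.
CLASSES = ["normal", "gain", "hemi_loss", "homo_loss"]
K = len(CLASSES)

def make_toy_data(n_per_class=15, seed=0):
    """Hypothetical 1-D feature: read-depth ratio relative to a diploid baseline."""
    rng = random.Random(seed)
    centers = {"homo_loss": 0.0, "hemi_loss": 0.5, "normal": 1.0, "gain": 1.5}
    X, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 0.08)])
            y.append(CLASSES.index(label))
    return X, y

class TinyNet:
    """One-hidden-layer tanh network with softmax output (the weak classifier)."""
    def __init__(self, n_in, n_hidden, rng):
        self.W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(K)]
        self.b2 = [0.0] * K

    def forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.W1, self.b1)]
        z = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(self.W2, self.b2)]
        m = max(z)
        e = [math.exp(v - m) for v in z]
        s = sum(e)
        return h, [v / s for v in e]

    def predict(self, x):
        p = self.forward(x)[1]
        return p.index(max(p))

    def fit(self, X, y, w, epochs=60, lr=0.2):
        """SGD on cross-entropy; each sample's step is scaled by its boosting weight."""
        n = len(X)
        for _ in range(epochs):
            for xi, yi, wi in zip(X, y, w):
                h, p = self.forward(xi)
                scale = lr * wi * n
                dz = [p[k] - (1.0 if k == yi else 0.0) for k in range(K)]
                dh = [(1 - h[j] ** 2) * sum(dz[k] * self.W2[k][j] for k in range(K))
                      for j in range(len(h))]
                for k in range(K):
                    for j in range(len(h)):
                        self.W2[k][j] -= scale * dz[k] * h[j]
                    self.b2[k] -= scale * dz[k]
                for j in range(len(h)):
                    for i in range(len(xi)):
                        self.W1[j][i] -= scale * dh[j] * xi[i]
                    self.b1[j] -= scale * dh[j]
        return self

def fit_adaboost(X, y, rounds=3, seed=0):
    """SAMME boosting loop: returns a list of (alpha, weak_learner) pairs."""
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        net = TinyNet(len(X[0]), 4, rng).fit(X, y, w)
        miss = [net.predict(xi) != yi for xi, yi in zip(X, y)]
        err = sum(wi for wi, m in zip(w, miss) if m)
        if err >= 1 - 1 / K:
            break  # weak learner is no better than random guessing
        err = max(err, 1e-10)
        alpha = math.log((1 - err) / err) + math.log(K - 1)
        ensemble.append((alpha, net))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(alpha) if m else wi for wi, m in zip(w, miss)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: alpha-weighted vote over the weak learners."""
    votes = [0.0] * K
    for alpha, net in ensemble:
        votes[net.predict(x)] += alpha
    return CLASSES[votes.index(max(votes))]

X, y = make_toy_data()
ensemble = fit_adaboost(X, y)
correct = sum(predict(ensemble, xi) == CLASSES[yi] for xi, yi in zip(X, y))
print(f"{len(ensemble)} weak learners, training accuracy {correct / len(y):.2f}")
```

Scaling each gradient step by the sample weight is one way to make a neural network respect the boosting weights; resampling the training set in proportion to the weights would be an alternative.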
biology, computer science, interdisciplinary applications