Improved Detection Algorithm for Copy Number Variations Based on Hidden Markov Model

Hai Yang, Daming Zhu
DOI: https://doi.org/10.1007/s11042-019-7368-z
IF: 2.577
2019-01-01
Multimedia Tools and Applications
Abstract: To address the problems of parameter optimization and insufficient utilization of split reads in copy number variation (CNV) detection, this paper proposes a new definition of relative read depth (RRD) and a randomized sampling strategy (RGN). Compared with raw read depth, RRD is only weakly correlated with GC content, mappability, and the width of the analysis windows tiled along the genome. The RGN strategy builds on weighted sampling and speeds up the analysis of read count data. We then propose an improved CNV detection algorithm based on a hidden Markov model (CNV-HMM). The HMM detects abnormal signals in the read count data and outputs candidate CNVs. In the final step, the candidate CNVs are filtered using split reads to improve the performance of CNV-HMM. Experimental results show that CNV-HMM has higher sensitivity and accuracy for CNV detection than most current detection algorithms and is applicable to both diploid animals and plants.
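To make the HMM step more concrete, the following is a minimal sketch of how a hidden Markov model can flag abnormal read-depth windows. It is not the authors' CNV-HMM: the abstract does not specify the states, emission model, or parameters, so the three states, Gaussian emissions, and every numeric value below are assumptions chosen for illustration. The sketch uses standard Viterbi decoding and does not implement RRD normalization, RGN sampling, or split-read filtering.

```python
import math

# Hypothetical copy-number states and parameters (illustrative, not from the paper).
STATES = ("deletion", "normal", "duplication")
MEANS = (0.5, 1.0, 1.5)   # assumed expected normalized read depth per state
SIGMA = 0.15              # assumed shared emission standard deviation
STAY = 0.98               # assumed probability of remaining in the same state


def log_gauss(x, mean, sigma):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


def viterbi(depths):
    """Return the most likely state label for each window (Viterbi decoding)."""
    n = len(STATES)
    if not depths:
        return []
    log_stay = math.log(STAY)
    log_move = math.log((1 - STAY) / (n - 1))
    # Uniform prior over states for the first window.
    score = [math.log(1 / n) + log_gauss(depths[0], MEANS[s], SIGMA) for s in range(n)]
    back = []
    for x in depths[1:]:
        new_score, pointers = [], []
        for s in range(n):
            # Best predecessor state for landing in state s at this window.
            best_prev = max(
                range(n),
                key=lambda p: score[p] + (log_stay if p == s else log_move),
            )
            trans = log_stay if best_prev == s else log_move
            new_score.append(score[best_prev] + trans + log_gauss(x, MEANS[s], SIGMA))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


def call_segments(labels):
    """Collapse per-window labels into (start, end_exclusive, state) for non-normal runs."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != "normal":
                segments.append((start, i, labels[start]))
            start = i
    return segments
```

Calling `call_segments(viterbi(depths))` on a list of per-window normalized depths yields candidate regions as window-index ranges. In the pipeline the abstract describes, such candidates would then be filtered using split-read evidence, a step this sketch leaves out.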