Deletion Detection Method Using the Distribution of Insert Size and a Precise Alignment Strategy

Zhen Zhang, Junwei Luo, Juan Shang, Min Li, Fang-Xiang Wu, Yi Pan, Jianxin Wang
DOI: https://doi.org/10.1109/tcbb.2019.2934407
2021-01-01
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Abstract: Homozygous and heterozygous deletions are common in the human genome. For current structural variation detection tools, it is important to determine whether a deletion is homozygous or heterozygous. However, sequencing errors, micro-homologies, and micro-insertions prevent common alignment tools from identifying accurate breakpoint locations and often lead to false structural variation calls. In this study, we present a novel deletion detection tool called Sprites2. Compared with Sprites, Sprites2 makes the following modifications: (1) The distribution of insert size is used to identify the type of each deletion and to improve the accuracy of deletion calls. (2) A precise alignment method based on AGE (an algorithm that simultaneously aligns the 5' and 3' ends of two sequences) is adopted to identify breakpoints, which helps resolve the problems introduced by sequencing errors, micro-homologies, and micro-insertions. To test and verify the performance of Sprites2, simulated and real datasets are used in our experiments, and Sprites2 is compared with five popular tools. The experimental results show that Sprites2 improves the performance of deletion detection. Sprites2 can be downloaded from https://github.com/zhangzhen/sprites2.
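
The abstract does not spell out how the insert size distribution is used to type a deletion, so the following is only a minimal sketch of the general idea, not the Sprites2 implementation. It assumes that read pairs spanning a deleted allele show an insert size well above the library mean, and it labels a candidate by the fraction of such pairs. The function name `classify_deletion`, the z-score cutoff, and the thresholds `het_low` and `het_high` are all hypothetical choices made for illustration.

```python
def classify_deletion(spanning_insert_sizes, lib_mean, lib_sd,
                      z=3.0, het_low=0.2, het_high=0.8):
    """Label a candidate deletion from the insert sizes of spanning read pairs.

    Illustrative heuristic only (not the Sprites2 method):
    a pair counts as discordant when its insert size exceeds
    lib_mean + z * lib_sd; the discordant fraction decides the label.
    """
    if not spanning_insert_sizes:
        return "no-call"  # no evidence either way

    cutoff = lib_mean + z * lib_sd
    discordant = sum(1 for size in spanning_insert_sizes if size > cutoff)
    fraction = discordant / len(spanning_insert_sizes)

    if fraction >= het_high:
        return "homozygous"    # nearly all pairs support the deletion
    if fraction >= het_low:
        return "heterozygous"  # mix of deleted and reference alleles
    return "no-deletion"       # pairs look like the normal library


# Example with an assumed library of mean 500 bp and standard deviation 50 bp.
mixed = [1500] * 5 + [500] * 5
print(classify_deletion(mixed, lib_mean=500, lib_sd=50))
```

Under these assumptions, a candidate where about half of the spanning pairs are stretched is labeled heterozygous, while one where nearly all pairs are stretched is labeled homozygous. A real caller would also need to account for coverage, mapping quality, and the deletion length.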