Comparative Analysis of Structural Variant Callers on the Short-Read Whole-Genome Sequencing Data

A. A. Mkrtchian, K. S. Grammatikati, P. G. Kazakova, S. I. Mitrofanov, P. U. Zemsky, A. A. Ivashechkin, M. N. Pilipenko, D. V. Svetlichny, A. P. Sergeev, E. A. Snigir, L. V. Frolova, T. A. Shpakova, V. S. Yudin, A. A. Keskinov, S. M. Yudin, V. I. Skvortsova
DOI: https://doi.org/10.31857/s0016675823060115
2023-06-01
Генетика
Abstract: In this study, three structural variant callers (Manta, Smoove, and Delly) were evaluated on whole-genome sequencing data under four alignment pipelines (DRAGEN, GDC DNA-Seq Alignment Workflow, GDC DNA-Seq Alignment Workflow + GDC DNA-Seq Co-Cleaning Workflow, and NovoAlign), two raw read lengths (2 × 150 bp and 2 × 250 bp), and different mean genome coverage values. The results were compared with the reference results of the GIAB team. Structural variants were also validated by Sanger sequencing. Deletions and insertions were detected best by Manta, with 89–96% accuracy and 59–70% sensitivity for the analysed deletions, and 96–99% accuracy and 15–36% sensitivity for insertions. Smoove and Delly produced less accurate and less sensitive results (Smoove: 91–95% accuracy and 8–54% sensitivity for deletions; Delly: 78–87% accuracy and 31–66% sensitivity for deletions, 99–100% accuracy and 1–13% sensitivity for insertions). Using two or even three structural variant callers simultaneously did not improve accuracy or sensitivity for deletions. The accuracy and sensitivity of the callers rose with increasing mean genome coverage, whereas increasing the read length from 150 to 250 bp affected the accuracy and sensitivity of individual tools to varying degrees. The study also showed that the accuracy of structural variant callers depends on the size range of the structural variants. For example, Manta detects deletions better in the range of 200 bp and larger, Delly from 1000 to 10 000 bp, and Smoove from 200 to 10 000 bp.
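The abstract reports accuracy and sensitivity against a reference call set without stating how calls were matched. The following is a minimal sketch of one common way such figures could be computed, not the authors' procedure. It rests on two assumptions: that "accuracy" is meant as precision, TP / (TP + FP), and that a call counts as a true positive when it shares at least 50% reciprocal overlap with a reference deletion. The `Deletion` class, the `evaluate` function, and the toy coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deletion:
    """A deletion interval (hypothetical helper type, half-open coordinates)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive

    @property
    def size(self) -> int:
        return self.end - self.start


def reciprocal_overlap(a: Deletion, b: Deletion) -> float:
    """Fraction of overlap relative to the larger of the two intervals."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / a.size, shared / b.size)


def evaluate(calls, truth, min_overlap=0.5):
    """Return (precision, sensitivity) of calls against a reference set.

    Assumption: each reference deletion can be matched by at most one call,
    and a match requires reciprocal overlap >= min_overlap.
    """
    matched = set()
    tp = 0
    for call in calls:
        for i, ref in enumerate(truth):
            if i not in matched and reciprocal_overlap(call, ref) >= min_overlap:
                matched.add(i)
                tp += 1
                break
    fp = len(calls) - tp   # calls with no matching reference deletion
    fn = len(truth) - tp   # reference deletions that no call recovered
    precision = tp / (tp + fp) if calls else 0.0
    sensitivity = tp / (tp + fn) if truth else 0.0
    return precision, sensitivity


# Toy data, invented for illustration only.
truth = [
    Deletion("chr1", 1000, 1500),
    Deletion("chr1", 5000, 7000),
    Deletion("chr2", 100, 400),
]
calls = [
    Deletion("chr1", 1020, 1510),  # overlaps the first reference deletion
    Deletion("chr1", 9000, 9300),  # no reference counterpart
]

precision, sensitivity = evaluate(calls, truth)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

Restricting `calls` and `truth` to a size range (for example, 200 to 10 000 bp) before calling `evaluate` would give per-size-range figures of the kind the abstract describes.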