A Comparative Analysis of Read Mapping and Indel Calling Pipelines for Next-Generation Sequencing Data

Jacob Porter, Jonathan Berkhahn, Liqing Zhang
DOI: https://doi.org/10.1016/b978-0-12-802508-6.00029-6
2015-01-01
Abstract: Insertions and deletions (indels) are one of the most common classes of mutations in the human genome. Correctly detecting and identifying indels is important in the study of human genetics and disease. In this chapter, we evaluated the precision and recall of combinations of read mapping and indel calling software for calling short and longer indels under varying read coverage and read length on simulated data. We examined the popular read mappers BFAST, Bowtie2, BWA, and SHRIMP, and the indel callers Dindel, FreeBayes, and SNVer. Interestingly, there were interactions between read mappers and indel callers. On simulated data, the BFAST-Dindel and SHRIMP-SNVer pipelines showed superior performance in most cases. Real data from human chromosome 22, with indels determined from alternative indel pipelines, were used to validate the computational pipelines and to assess run time. The SHRIMP-SNVer pipeline was the most accurate on one real data set, while pipelines with FreeBayes did poorly. We also discuss reasons for pipeline accuracy.