MUSIC: A Hybrid Computing Environment for Burrows-Wheeler Alignment for Massive Amount of Short Read Sequence Data

Saurabh Gupta,Sanjoy Chaudhury 'and' Binay Panda
DOI: https://doi.org/10.48550/arXiv.1402.0632
2014-02-04
Abstract:High-throughput DNA sequencers are becoming indispensable in our understanding of diseases at molecular level, in marker-assisted selection in agriculture and in microbial genetics research. These sequencing instruments produce enormous amount of data (often terabytes of raw data in a month) that requires efficient analysis, management and interpretation. The commonly used sequencing instrument today produces billions of short reads (upto 150 bases) from each run. The first step in the data analysis step is alignment of these short reads to the reference genome of choice. There are different open source algorithms available for sequence alignment to the reference genome. These tools normally have a high computational overhead, both in terms of number of processors and memory. Here, we propose a hybrid-computing environment called MUSIC (Mapping USIng hybrid Computing) for one of the most popular open source sequence alignment algorithm, BWA, using accelerators that show significant improvement in speed over the serial code.
Genomics
What problem does this paper attempt to address?