Seed-and-Vote based In-Memory Accelerator for DNA Read Mapping

Ann Franchesca Laguna,Hasindu Gamaarachchi,Xunzhao Yin,Michael T. Niemier,Sri Parameswaran,Xiaobo Sharon Hu
DOI: https://doi.org/10.1145/3400302.3415651
2020-01-01
Abstract:Genome analysis is becoming more important in the fields of forensic science, medicine, and history. Sequencing technologies such as High Throughput Sequencing (HTS) and Third Generation Sequencing (TGS) have greatly accelerated genome sequencing. However, genome read mapping remains significantly slower than sequencing. Because of the enormous amount of data needed, the speed of the data transfer between the memory and the processing unit limits the execution speed. In-memory computing can help address the memory-bandwidth bottleneck by minimizing data transfers. Ternary Content Addressable Memories (TCAMs) have been used in accelerators because of their fast searching capability for seed-and-extend, a popular read mapping approach. Seed-and-vote, another read mapping approach, is faster than the seed-and-extend approach but has lower accuracies when used with very short reads. Since sequencing technology is moving to longer reads, the seed-and-vote approach is becoming more viable. We propose a genome read mapping accelerator that uses approximate TCAM to execute the Fast Seed and Vote algorithm (FSVA) that can map both short and long reads. We achieved 400X acceleration compared to the seed-and-extend approach BWA-MEM on a CPU and 115X acceleration at 30X energy improvement compared to state-of-the-art in-memory accelerator using the seed-and-extend approach at 98.75% accuracy for 100bp reads.
What problem does this paper attempt to address?