Fast construction of FM-index for long sequence reads

Heng Li
DOI: https://doi.org/10.1093/bioinformatics/btu541
2014-06-03
Abstract: Summary: We present a new method to incrementally construct the FM-index for both short and long sequence reads, up to the size of a genome. It is the first algorithm that can build the index while implicitly sorting the sequences in reverse (complement) lexicographical order, without a separate sorting step. The implementation is among the fastest for indexing short reads and the only one that works in practice for reads averaging kilobases in length.
Availability and implementation: https://github.com/lh3/ropebwt2
Contact: hengli@broadinstitute.org
Genomics, Data Structures and Algorithms
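
For readers unfamiliar with the FM-index named in the abstract, the following is a minimal Python sketch of what such an index is and how it answers a pattern-count query. It is not the paper's method: it builds the Burrows-Wheeler transform of a single string by naively sorting all suffixes, whereas the paper constructs the index incrementally over a collection of reads. The function names (bwt, build_fm, count_occurrences) and the example string are illustrative assumptions, not part of ropebwt2.

```python
def bwt(text):
    """Naive Burrows-Wheeler transform; text must end with a unique '$' sentinel."""
    # Sort suffix start positions by the suffix they begin (quadratic, for illustration only).
    suffix_order = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT symbol for each suffix is the character preceding it (cyclically).
    return "".join(text[i - 1] for i in suffix_order)


def build_fm(bwt_str):
    """Build the two FM-index tables from a BWT string.

    C[c]      = number of symbols in the text that are smaller than c
    occ[c][i] = number of occurrences of c in bwt_str[:i]
    """
    alphabet = sorted(set(bwt_str))
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += bwt_str.count(c)
    occ = {c: [0] * (len(bwt_str) + 1) for c in alphabet}
    for i, ch in enumerate(bwt_str):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ


def count_occurrences(pattern, C, occ, n):
    """Count occurrences of pattern by backward search over a BWT of length n."""
    lo, hi = 0, n  # half-open interval of suffixes matching the processed part of the pattern
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


text = "GATTACA$"  # hypothetical example sequence
transformed = bwt(text)
C, occ = build_fm(transformed)
print(transformed)
print(count_occurrences("TA", C, occ, len(transformed)))
```

The full occ table here uses memory proportional to the alphabet size times the text length. Practical implementations sample or compress it, which this sketch does not attempt.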