Spaln3: improvement in speed and accuracy of genome mapping and spliced alignment of protein query sequences

Osamu Gotoh
DOI: https://doi.org/10.1093/bioinformatics/btae517
IF: 5.8
2024-08-02
Bioinformatics
Abstract:
Motivation: Spaln is the earliest practical tool for self-sufficient genome mapping and spliced alignment of protein query sequences onto a mammalian-sized eukaryotic genomic sequence. However, its computational speed has become inadequate for the analysis of rapidly growing genomic and transcript sequence data.
Results: The dynamic programming calculation of Spaln has been sped up in two ways: (i) the introduction of the multi-intermediate unidirectional Hirschberg method and (ii) SIMD-based vectorization. The new version, Spaln3, is ∼7 times faster than the latest Spaln version 2, and its gene prediction accuracy is consistently higher than that of Miniprot.
Availability and implementation: https://github.com/ogotoh/spaln.
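For readers unfamiliar with the Hirschberg idea named in the Results, the following is a minimal sketch of the classic (bidirectional, single-midpoint) Hirschberg method for global alignment in linear memory. It is not Spaln3's multi-intermediate unidirectional variant, and it ignores affine gaps, splice sites, and SIMD. The function names and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen purely for illustration.

```python
# Illustrative only: classic Hirschberg with a linear gap penalty.
# Scores and function names are hypothetical, not taken from Spaln.
MATCH, MISMATCH, GAP = 1, -1, -1


def sub(x, y):
    """Substitution score for aligning characters x and y."""
    return MATCH if x == y else MISMATCH


def nw_last_row(a, b):
    """Last row of the global-alignment DP matrix, keeping two rows in memory."""
    prev = [j * GAP for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * GAP] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cur[j] = max(prev[j - 1] + sub(a[i - 1], b[j - 1]),
                         prev[j] + GAP,
                         cur[j - 1] + GAP)
        prev = cur
    return prev


def hirschberg(a, b):
    """Return an optimal global alignment of a and b as two gapped strings."""
    if not a:
        return "-" * len(b), b
    if not b:
        return a, "-" * len(a)
    if len(a) == 1:
        # Base case: pair the single character with its best partner in b,
        # or leave it unpaired if pairing scores worse than two gaps.
        k = max(range(len(b)), key=lambda j: sub(a, b[j]))
        if sub(a, b[k]) >= 2 * GAP:
            return "-" * k + a + "-" * (len(b) - k - 1), b
        return a + "-" * len(b), "-" + b
    mid = len(a) // 2
    # Forward scores for the top half, backward scores for the bottom half.
    left = nw_last_row(a[:mid], b)
    right = nw_last_row(a[mid:][::-1], b[::-1])
    n = len(b)
    split = max(range(n + 1), key=lambda j: left[j] + right[n - j])
    top = hirschberg(a[:mid], b[:split])
    bottom = hirschberg(a[mid:], b[split:])
    return top[0] + bottom[0], top[1] + bottom[1]


def alignment_score(x, y):
    """Score of a gapped alignment under the same scoring scheme."""
    return sum(GAP if "-" in (p, q) else sub(p, q) for p, q in zip(x, y))
```

Calling `hirschberg("AGTACGCA", "TATGC")` returns two equal-length gapped strings whose score matches the last entry of `nw_last_row("AGTACGCA", "TATGC")`, while only ever holding two DP rows in memory at a time.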