A Bidirectional Fuzzy Index and Approximate Search Algorithm for Next Generation Sequencing

Wei Quan, Guangri Quan, Bo Liu, Yadong Wang
DOI: https://doi.org/10.1109/BIBM47256.2019.8982961
2019-01-01
Abstract: Sequence alignment is one of the most important problems in bioinformatics. However, existing alignment tools may produce a large number of candidate locations, which degrades alignment performance. Recent approaches discard highly repetitive seeds to improve alignment speed, which affects alignment accuracy. To this end, we propose a novel fuzzy index (fBWT), which is available at https://github.com/weiquan/appr_bwt. It supports approximate search and allows seeds to be extended in length, which reduces the number of candidate locations and accelerates sequence alignment. The performance of our tool was compared with BWA on datasets of 150bp and 250bp reads. The results show that the number of misaligned reads (reads whose correct position is not included in the high-score candidate position set) for the current mainstream tool is 5-10 times higher than that of our tool. We also compare the efficiency of the presented tool with that of existing tools. Under the above conditions, the alignment time of the presented tool is very close to that of the existing tools. When similar alignment accuracy is required, however, fBWT aligns much faster than BWA.
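The abstract's central claim, that longer seeds yield fewer candidate locations, can be illustrated with a minimal sketch. This is not the fBWT index: it assumes a plain exact-match k-mer hash table, and the function names `build_kmer_index` and `candidate_locations`, as well as the toy reference and read, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_kmer_index(reference: str, k: int) -> dict:
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_locations(reference: str, read: str, k: int) -> list:
    """Return reference positions where the first k bases of the read match exactly."""
    index = build_kmer_index(reference, k)
    # .get() avoids inserting a new key into the defaultdict on a miss.
    return index.get(read[:k], [])


# Toy data (hypothetical): the repeat "ACG" occurs four times in the reference.
reference = "ACGTACGTACGAACGTTT"
read = "ACGTT"

print(candidate_locations(reference, read, 3))  # [0, 4, 8, 12]
print(candidate_locations(reference, read, 5))  # [12]
```

With a 3-base seed, every copy of the repeat is a candidate that would have to be verified; with a 5-base seed only the true position remains. An exact index like this one loses a long seed as soon as the read contains a single mismatch, which is the gap an index with approximate search is meant to close.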