Aligning High Error Rate Reads Using Enhanced Sparse Suffix Array Index

Hao WEI, Cheng ZHONG
DOI: https://doi.org/10.3969/j.issn.1000-1220.2019.08.044
2019-01-01
Abstract: Biological sequence alignment helps locate similar regions between sequences. The rapid development of sequencing technology requires sequence-mapping algorithms to flexibly process longer reads with higher error rates. This paper proposes an improved long-read alignment algorithm: the reference sequence is indexed with an enhanced sparse suffix array, maximal exact matches (MEMs) and super-maximal exact matches (SMEMs) between the reference sequence and the reads are found by adaptively adjusting the minimum seed length, and the seeds are extended from these two kinds of matches. Experimental results on simulated and real data show that, compared with the existing representative algorithm, the proposed algorithm significantly improves recall and achieves higher overall sensitivity while maintaining essentially the same accuracy, and it identifies more reads.
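The seeding idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It uses a plain sparse suffix array (every k-th suffix, without the LCP and child tables an enhanced index would carry), finds maximal exact matches only (no SMEM filtering), and lowers the minimum seed length by a fixed step until some seed is found. The function names, the `step` parameter, and the fallback rule are all assumptions made for illustration, and `floor_len` must be at least `k`.

```python
from bisect import bisect_left, bisect_right


def build_sparse_suffix_array(ref, k):
    """Sorted start positions of every k-th suffix of ref (hypothetical helper)."""
    return sorted(range(0, len(ref), k), key=lambda i: ref[i:])


def find_mems(ref, ssa, k, read, min_len):
    """Return sorted (ref_start, read_start, length) triples for all
    maximal exact matches of length >= min_len."""
    m = min_len - k + 1  # any MEM >= min_len covers a sampled suffix by >= m chars
    if m < 1:
        raise ValueError("min_len must be at least k")
    # Truncating sorted suffixes to m characters keeps them in sorted order.
    keys = [ref[i:i + m] for i in ssa]
    mems = set()
    for q in range(len(read) - m + 1):
        pat = read[q:q + m]
        lo, hi = bisect_left(keys, pat), bisect_right(keys, pat)
        for r in ssa[lo:hi]:
            # Extend to the right as far as the characters agree.
            length = m
            while (r + length < len(ref) and q + length < len(read)
                   and ref[r + length] == read[q + length]):
                length += 1
            # Extend to the left by at most k characters.
            left = 0
            while (left < k and r - left - 1 >= 0 and q - left - 1 >= 0
                   and ref[r - left - 1] == read[q - left - 1]):
                left += 1
            if left >= k:
                continue  # an earlier sampled suffix reports this match
            if left + length >= min_len:
                mems.add((r - left, q - left, left + length))
    return sorted(mems)


def adaptive_seeds(ref, ssa, k, read, start_len, floor_len, step=2):
    """Lower the minimum seed length until seeds appear (assumed rule)."""
    min_len = start_len
    while True:
        seeds = find_mems(ref, ssa, k, read, min_len)
        if seeds or min_len <= floor_len:
            return min_len, seeds
        min_len = max(floor_len, min_len - step)
```

For example, with `ref = "GATTACAGATTACA"`, `k = 2`, and `ssa = build_sparse_suffix_array(ref, k)`, calling `adaptive_seeds(ref, ssa, k, "TTACAGA", 10, 4, step=3)` first tries a minimum length of 10, finds nothing, and retries at 7. Sampling only every k-th suffix shrinks the index roughly k-fold, at the cost of the left-extension step needed to recover matches that start between sampled positions.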