A pre-processing algorithm for improving efficiency of the Boyer-Moore algorithm

Renchao Jin, Enmin Song
DOI: https://doi.org/10.3321/j.issn:1671-4512.2005.z1.074
2005-01-01
Abstract: The original Boyer-Moore algorithm can easily be adapted to match the pattern in the opposite direction. Theoretical analysis and experiments showed that, for most string patterns, matching in one direction is more efficient than matching in the other for almost all texts. A pre-processing algorithm with time complexity O(σm) and space complexity O(σ+m) is proposed, where σ is the size of the alphabet and m is the length of the pattern. The algorithm determines an optimal matching direction by examining the pattern alone and then runs the Boyer-Moore algorithm in that direction. In the experiments, 1000 DNA sequences from a human gene database, each of length greater than 1000, were chosen as the texts, and 1000 segments of length 20 were randomly chosen from these sequences as the patterns. The results showed that, on average, the proposed algorithm reduces pattern-matching time to 90% of that of the original algorithm.
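To illustrate how matching direction can affect the amount of work, the following is a minimal sketch and not the pre-processing algorithm proposed in the paper. It assumes the Horspool simplification of Boyer-Moore (bad-character rule only) and models the opposite direction by reversing both pattern and text. The function names `horspool_count` and `search_in_direction` are hypothetical, chosen for this example.

```python
def horspool_count(pattern, text):
    """Return (match start positions, number of character comparisons).

    Horspool simplification of Boyer-Moore: only the bad-character
    shift is used. This is an illustrative assumption, not the
    variant analysed in the paper.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return [], 0
    # Shift for each character: distance from its last occurrence
    # (ignoring the final pattern position) to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    hits, comparisons, pos = [], 0, 0
    while pos <= n - m:
        j = m - 1
        # Compare the window against the pattern from its last character.
        while j >= 0:
            comparisons += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            hits.append(pos)
        # Characters absent from the table allow a full shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return hits, comparisons


def search_in_direction(pattern, text, reverse=False):
    """Run the search in the usual or the mirrored direction."""
    if not reverse:
        return horspool_count(pattern, text)
    hits, comparisons = horspool_count(pattern[::-1], text[::-1])
    n, m = len(text), len(pattern)
    # Map positions in the reversed text back to original start indices.
    return sorted(n - m - h for h in hits), comparisons


if __name__ == "__main__":
    text = "ACGTACGTTACG"
    for rev in (False, True):
        hits, cmp_count = search_in_direction("ACG", text, reverse=rev)
        print("reverse" if rev else "forward", hits, cmp_count)
```

Both directions report the same match positions, while the comparison counts may differ from pattern to pattern. That difference is the quantity a direction-selecting pre-processing step would aim to predict from the pattern alone.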