Efficient Approximate Substring Matching In Compressed String

Yutong Han, Bin Wang, Xiaochun Yang
DOI: https://doi.org/10.1007/978-3-319-39958-4_15
2016-01-01
Abstract: The LZ77 self-index has been proposed for storing repetitive text in compressed form. Existing methods for approximate string matching based on LZ77 focus on space efficiency. We focus on how to efficiently search for similar strings in a text without decompressing the whole text. We propose the RS-search algorithm, which efficiently merges all occurrences of a substring to narrow down the potential region, and we design novel filters to reduce the number of candidates. The experiments show that our algorithm achieves outstanding performance and an interesting time-space trade-off for approximate matching in compressed strings.
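To make the underlying problem concrete, the sketch below shows approximate substring matching on plain, uncompressed text using the classic Sellers dynamic program. It is not the RS-search algorithm and does not use an LZ77 self-index. It only illustrates the kind of query the abstract describes. The function name `approximate_match_ends` and the choice of edit distance with threshold `k` as the similarity measure are assumptions made for illustration, since the abstract does not specify them.

```python
def approximate_match_ends(pattern, text, k):
    """Return every index j such that some substring of `text` ending at j
    is within edit distance k of `pattern` (Sellers' algorithm).

    Illustrative baseline only: it scans the uncompressed text and is not
    the paper's RS-search method.
    """
    m = len(pattern)
    # col[i] holds the minimum edit distance between pattern[:i] and any
    # substring of the text ending at the current position.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text):
        prev_diag = col[0]   # value for (i - 1) in the previous column
        col[0] = 0           # a match may start at any text position
        for i in range(1, m + 1):
            cur = col[i]     # value for i in the previous column
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         cur + 1,           # extra character in the text
                         col[i - 1] + 1)    # missing pattern character
            prev_diag = cur
        if col[m] <= k:
            ends.append(j)
    return ends
```

This baseline takes time proportional to the pattern length times the text length and requires the full text. The abstract's stated goal is to avoid that cost by answering such queries without decompressing the whole text.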