Streaming Pattern Matching with d Wildcards

Shay Golan,Tsvi Kopelowitz,Ely Porat
DOI: https://doi.org/10.1007/s00453-018-0521-7
2017-04-06
Abstract:In the pattern matching with $d$ wildcards problem one is given a text $T$ of length $n$ and a pattern $P$ of length $m$ that contains $d$ wildcard characters, each denoted by a special symbol $'?'$. A wildcard character matches any other character. The goal is to establish for each $m$-length substring of $T$ whether it matches $P$. In the streaming model variant of the pattern matching with $d$ wildcards problem the text $T$ arrives one character at a time and the goal is to report, before the next character arrives, if the last $m$ characters match $P$ while using only $o(m)$ words of space. In this paper we introduce two new algorithms for the $d$ wildcard pattern matching problem in the streaming model. The first is a randomized Monte Carlo algorithm that is parameterized by a constant $0\leq \delta \leq 1$. This algorithm uses $\tilde{O}(d^{1-\delta})$ amortized time per character and $\tilde{O}(d^{1+\delta})$ words of space. The second algorithm, which is used as a black box in the first algorithm, is a randomized Monte Carlo algorithm which uses $O(d+\log m)$ worst-case time per character and $O(d\log m)$ words of space.
Data Structures and Algorithms
What problem does this paper attempt to address?