Fast Algorithms for the Shortest Unique Palindromic Substring Problem on Run-Length Encoded Strings

Kiichi Watanabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda
DOI: https://doi.org/10.48550/arXiv.1903.06290
2020-03-24
Abstract: For a string $S$, a palindromic substring $S[i..j]$ is said to be a \emph{shortest unique palindromic substring} ($\mathit{SUPS}$) for an interval $[s, t]$ in $S$, if $S[i..j]$ occurs exactly once in $S$, the interval $[i, j]$ contains $[s, t]$, and every palindromic substring containing $[s, t]$ which is shorter than $S[i..j]$ occurs at least twice in $S$. In this paper, we study the problem of answering $\mathit{SUPS}$ queries on run-length encoded strings. We show how to preprocess a given run-length encoded string $\mathit{RLE}_{S}$ of size $m$ in $O(m)$ space and $O(m \log \sigma_{\mathit{RLE}_{S}} + m \sqrt{\log m / \log\log m})$ time so that all $\mathit{SUPSs}$ for any subsequent query interval can be answered in $O(\sqrt{\log m / \log\log m} + \alpha)$ time, where $\alpha$ is the number of outputs, and $\sigma_{\mathit{RLE}_{S}}$ is the number of distinct runs of $\mathit{RLE}_{S}$. Additionally, we consider a variant of the $\mathit{SUPS}$ problem where a query interval is also given in a run-length encoded form. For this variant of the problem, we present two alternative algorithms with faster queries. The first one answers queries in $O(\sqrt{\log\log m /\log\log\log m} + \alpha)$ time and can be built in $O(m \log \sigma_{\mathit{RLE}_{S}} + m \sqrt{\log m / \log\log m})$ time, and the second one answers queries in $O(\log \log m + \alpha)$ time and can be built in $O(m \log \sigma_{\mathit{RLE}_{S}})$ time. Both of these data structures require $O(m)$ space.
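To make the definition concrete, the following is a minimal brute-force sketch that works on the plain (uncompressed) string. It is not the paper's algorithm and makes no attempt at the stated time or space bounds. The function names (`run_length_encode`, `count_occurrences`, `sups_bruteforce`) and the 0-indexed, inclusive interval convention are assumptions chosen for illustration.

```python
from itertools import groupby


def run_length_encode(text):
    """Return the run-length encoding of text as a list of (char, length) runs."""
    return [(ch, len(list(group))) for ch, group in groupby(text)]


def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    return sum(
        1
        for k in range(len(text) - len(pattern) + 1)
        if text.startswith(pattern, k)
    )


def sups_bruteforce(text, s, t):
    """Return all SUPSs for the interval [s, t] as (i, j) pairs.

    Indices are 0-based and inclusive (an assumption of this sketch).
    An empty list means no unique palindrome contains [s, t].
    """
    n = len(text)
    # Try candidate lengths from shortest to longest. The first length that
    # yields a unique palindrome covering [s, t] gives every answer.
    for length in range(t - s + 1, n + 1):
        found = []
        # Start positions i with i <= s, i + length - 1 >= t, and the
        # substring fitting inside text.
        for i in range(max(0, t - length + 1), min(s, n - length) + 1):
            j = i + length - 1
            sub = text[i:j + 1]
            if sub == sub[::-1] and count_occurrences(text, sub) == 1:
                found.append((i, j))
        if found:
            return found
    return []


if __name__ == "__main__":
    runs = run_length_encode("aabbbaa")
    print(runs)            # m is len(runs)
    print(len(set(runs)))  # number of distinct runs
    print(sups_bruteforce("abcba", 1, 1))
    print(sups_bruteforce("abab", 1, 2))
```

In this sketch, `len(runs)` plays the role of $m$ and `len(set(runs))` the role of $\sigma_{\mathit{RLE}_{S}}$. A query can return more than one interval, which is why the query bounds in the abstract carry the additive term $\alpha$.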
Data Structures and Algorithms