All-Pairs Suffix-Prefix on Dynamic Set of Strings

Masaru Kikuchi, Shunsuke Inenaga
2024-07-25
Abstract: The all-pairs suffix-prefix (APSP) problem is a classical problem in string processing with important applications in bioinformatics. Given a set $\mathcal{S} = \{S_1, \ldots, S_k\}$ of $k$ strings, the APSP problem asks one to compute the longest suffix of $S_i$ that is a prefix of $S_j$ for all $k^2$ ordered pairs $\langle S_i, S_j \rangle$ of strings in $\mathcal{S}$. In this paper, we consider the dynamic version of the APSP problem, which allows insertions of new strings into the set. Our objective is, each time a new string $S_i$ is added to the current set $\mathcal{S}_{i-1} = \{S_1, \ldots, S_{i-1}\}$ of $i-1$ strings, to compute (1) the longest suffix of $S_i$ that is a prefix of $S_j$ and (2) the longest prefix of $S_i$ that is a suffix of $S_j$ for all $1 \leq j \leq i$. We propose an $O(n)$-space data structure which computes (1) and (2) in $O(|S_i| \log \sigma + i)$ time for each new string $S_i$, where $n$ is the total length of the strings.
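To make the two queries concrete, here is a minimal brute-force sketch in Python. It is not the data structure proposed in the paper and does not achieve the stated $O(|S_i| \log \sigma + i)$ bound; it only illustrates what (1) and (2) return on each insertion. The names `longest_suffix_prefix` and `NaiveDynamicAPSP` are hypothetical, and the handling of the pair $j = i$ (reporting the whole string) is an assumption, since the abstract does not say whether only proper overlaps are intended.

```python
def longest_suffix_prefix(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    # Try candidate overlap lengths from longest to shortest.
    for length in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:length]):
            return length
    return 0


class NaiveDynamicAPSP:
    """Brute-force baseline for the insertion-only (dynamic) APSP queries."""

    def __init__(self) -> None:
        self.strings: list[str] = []

    def insert(self, s: str) -> tuple[list[int], list[int]]:
        """Add `s` and return overlap lengths against every stored string.

        suf[j] : longest suffix of s that is a prefix of strings[j]  (query 1)
        pre[j] : longest prefix of s that is a suffix of strings[j]  (query 2)
        The last entry of each list is the pair of `s` with itself, which this
        sketch reports as len(s) (an assumption, see the lead-in).
        """
        self.strings.append(s)
        suf = [longest_suffix_prefix(s, t) for t in self.strings]
        pre = [longest_suffix_prefix(t, s) for t in self.strings]
        return suf, pre


apsp = NaiveDynamicAPSP()
first = apsp.insert("abcab")
second = apsp.insert("cabd")
print(first, second)
```

In this example, inserting `"cabd"` after `"abcab"` reports an overlap of length 3 for query (2), because the prefix `"cab"` of the new string is a suffix of `"abcab"`, and no overlap for query (1).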
Data Structures and Algorithms