Abstract: Exact pattern matching is a fundamental programming problem. Algorithms that solve it are core components of search engines, version control systems, text editors, DNA analyzers, and many other tools. For simplicity, articles usually denote the pattern as P or p and its length as M or m. Similarly, the text is usually denoted as T or t and its length as N or n. The alphabet is denoted Σ and its size |Σ|. With this notation, the problem of pattern matching can be stated as follows: find all positions i (or the number of such positions) such that P[0...m-1] = T[i...i+m-1]; in other words, find all positions i in the text for which the substring of length m starting at position i is equal to the pattern. The main parameters of this problem are the pattern length and the alphabet size. The length of the text usually does not matter because, for any sufficiently long text of a specific structure, the running time of an algorithm per character is close to constant. In addition, the specifics of the input pattern and text may also affect the performance of the algorithms. All of this makes the problem both nuanced and interesting to investigate. Many solutions to this problem have been developed over the last five decades. The main part of this work provides short descriptions and analyses of a set of algorithms that are still relevant in the field. It also comments on their theoretical regions of efficiency and how these depend on the specifics of the input. Results of practical experiments on a variety of randomly generated test data are provided. The conclusion analyzes the obtained results and the efficiency of each class of algorithms depending on the input, and presents the results visually as a table showing the most efficient algorithm for each pair of pattern length and alphabet size.
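To make the problem statement concrete, here is a minimal sketch of the naive approach, which checks every candidate position i directly. It is not one of the algorithms analyzed in this work; the function name `find_occurrences` and the choice to return no matches for an empty pattern are assumptions made for illustration.

```python
def find_occurrences(pattern, text):
    """Return all 0-based positions i where text[i:i+m] equals pattern.

    Naive scan: up to (n - m + 1) window comparisons of length m each,
    so the worst case is O(n * m). Illustrative baseline only.
    """
    m, n = len(pattern), len(text)
    # Assumption: an empty pattern, or one longer than the text, has no matches.
    if m == 0 or m > n:
        return []
    # Compare the length-m window starting at each position i with the pattern.
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]


if __name__ == "__main__":
    print(find_occurrences("aba", "ababa"))
```

Overlapping occurrences are reported separately, which matches the "find all positions i" formulation above.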
