Estimating the Lengths of Memory Words

Gusztav Morvai, Benjamin Weiss
DOI: https://doi.org/10.48550/arXiv.0808.2964
2008-08-22
Abstract: For a stationary stochastic process $\{X_n\}$ with values in some set $A$, a finite word $w \in A^K$ is called a memory word if the conditional probability of $X_0$ given the past is constant on the cylinder set defined by $X_{-K}^{-1}=w$. It is called a minimal memory word if no proper suffix of $w$ is also a memory word. For example, in a $K$-step Markov process all words of length $K$ are memory words, but not necessarily minimal ones. We consider the problem of determining the lengths of the longest minimal memory words and the shortest memory words of an unknown process $\{X_n\}$ based on sequentially observing the outputs of a single sample $\{\xi_1,\xi_2,\ldots,\xi_n\}$. We give a universal estimator which converges almost surely to the length of the longest minimal memory word, and show that no such universal estimator exists for the length of the shortest memory word. The alphabet $A$ may be finite or countable.
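To make the definitions concrete, the following sketch enumerates memory words and minimal memory words for a small binary process whose transition law is known. This is an illustration of the definitions only, not the estimator of the paper, which works from a single observed sample of an unknown process. The kernel `p_one`, the constant `ORDER`, and the helper names are hypothetical choices made for this example; a word is written as a string whose last character is $X_{-1}$.

```python
from itertools import product

ALPHABET = "01"
ORDER = 2  # assumption: the toy process is 2-step Markov by construction


def p_one(past: str) -> float:
    """P(X_0 = 1 | past) for a hypothetical toy kernel; past[-1] is X_{-1}."""
    if past[-1] == "1":
        return 0.3  # after a 1, nothing further back matters
    # after a 0, the symbol before it still matters
    return 0.8 if past[-2] == "0" else 0.5


def is_memory_word(w: str) -> bool:
    """True if the law of X_0 is the same for every past ending in w."""
    pad = ORDER - len(w)
    if pad <= 0:
        return True  # in an ORDER-step Markov process, longer words always qualify
    # Extend w further into the past in every possible way and compare.
    values = {p_one("".join(prefix) + w) for prefix in product(ALPHABET, repeat=pad)}
    return len(values) == 1


def is_minimal_memory_word(w: str) -> bool:
    """Memory word with no proper (non-empty) suffix that is a memory word."""
    return is_memory_word(w) and not any(
        is_memory_word(w[i:]) for i in range(1, len(w))
    )


words = [
    "".join(t) for k in range(1, ORDER + 1) for t in product(ALPHABET, repeat=k)
]
memory_words = [w for w in words if is_memory_word(w)]
minimal_words = [w for w in words if is_minimal_memory_word(w)]

longest_minimal = max(len(w) for w in minimal_words)
shortest_memory = min(len(w) for w in memory_words)

print("minimal memory words:", minimal_words)
print("longest minimal length:", longest_minimal)
print("shortest memory length:", shortest_memory)
```

In this toy process the word `1` is a memory word on its own, while `0` is not, because the law of $X_0$ after a 0 still depends on $X_{-2}$. The two quantities discussed in the abstract therefore differ here: the longest minimal memory word has length 2 and the shortest memory word has length 1.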
Information Theory