Words Avoiding Tangrams

Michał Dębski, Jarosław Grytczuk, Bartłomiej Pawlik, Jakub Przybyło, Małgorzata Śleszyńska-Nowak
2024-07-04
Abstract: A \emph{tangram} is a word in which every letter occurs an even number of times. Such a word can be cut into parts that can be arranged into two identical words. The minimum number of cuts needed is called the \emph{cut number} of a tangram. For example, the word $\mathtt{\color{red}{0102}\color{blue}{0102}}$ is a tangram with cut number one, while the word $\mathtt{\color{red}{01}\color{blue}{01023}\color{red}{023}}$ is a tangram with cut number two. Clearly, tangrams with cut number one coincide with the well-known family of words called \emph{squares}, which have the form $UU$ for some nonempty word $U$. A word $W$ \emph{avoids} a word $T$ if it is not possible to write $W=ATB$ for any words $A$ and $B$ (possibly empty). The famous 1906 theorem of Thue asserts that there exist arbitrarily long words avoiding squares over an alphabet with just \emph{three} letters. Given a fixed number $k\geqslant 1$, how many letters are needed to avoid tangrams with cut number at most $k$? Let $t(k)$ denote the minimum size of an alphabet needed for that purpose. By Thue's result we have $t(1)=3$, which easily implies $t(2)=3$. Curiously, these are currently the only known exact values of this function. In our main result we prove that $t(k)=\Theta(\log_2 k)$. The proof uses an \emph{entropy compression} argument and \emph{Zimin words}. Using a different method, we prove that $t(k)\leqslant k+1$ for all $k\geqslant 4$, which gives more precise estimates for small values of $k$. The proof makes use of \emph{Dejean words} and a curious property of \emph{Gauss words}, which is perhaps of independent interest.
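The definitions above can be checked by brute force on short words. The following Python sketch is illustrative only and is not taken from the paper: the function names (`is_tangram`, `cut_number`, `contains_square`, `avoids_tangrams`) are hypothetical, and it assumes that $k$ cuts yield $k+1$ parts, which may be reordered freely and split into two groups that spell the same word. The search is exponential, so it is suitable only for very short words.

```python
from collections import Counter
from itertools import combinations, permutations


def is_tangram(word):
    """True if the word is nonempty and every letter occurs an even number of times."""
    return len(word) > 0 and all(c % 2 == 0 for c in Counter(word).values())


def cut_number(word):
    """Brute-force cut number; returns None if the word is not a tangram.

    Assumption: the k+1 parts may be arranged in any order, and the
    arrangement must split at a part boundary into two identical halves.
    """
    if not is_tangram(word):
        return None
    n = len(word)
    half = n // 2
    for k in range(1, n):
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            parts = [word[a:b] for a, b in zip(bounds, bounds[1:])]
            for order in permutations(parts):
                total = 0
                for part in order:
                    total += len(part)
                    if total >= half:
                        break
                if total != half:
                    continue  # the midpoint falls inside a part
                joined = "".join(order)
                if joined[:half] == joined[half:]:
                    return k
    return None  # unreachable for a tangram: cutting into single letters always works


def contains_square(word):
    """True if some factor of the word has the form UU with U nonempty."""
    n = len(word)
    return any(
        word[i:i + m] == word[i + m:i + 2 * m]
        for m in range(1, n // 2 + 1)
        for i in range(n - 2 * m + 1)
    )


def avoids_tangrams(word, k):
    """True if no factor of the word is a tangram with cut number at most k."""
    n = len(word)
    for i in range(n):
        for j in range(i + 2, n + 1, 2):  # tangrams have even length
            c = cut_number(word[i:j])
            if c is not None and c <= k:
                return False
    return True


if __name__ == "__main__":
    print(cut_number("01020102"))    # the first example word from the abstract
    print(cut_number("0101023023"))  # the second example word from the abstract
    print(contains_square("0102012"), avoids_tangrams("0102012", 2))
```

Under these assumptions, the sketch reproduces the cut numbers one and two for the two example words. It also illustrates why avoiding squares suffices for $k=2$: the square-free word $\mathtt{0102012}$ contains no tangram with cut number at most two, although its factor $\mathtt{102012}$ is a tangram with cut number three.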
Combinatorics