Can Evolutionary Computation Help us to Crib the Voynich Manuscript?

Daniel Devatman Hromada
DOI: https://doi.org/10.48550/arXiv.2107.05381
2021-07-08
Abstract: Departing from the postulate that the Voynich Manuscript is not a hoax but rather encodes authentic contents, our article presents an evolutionary algorithm which aims to find the optimal mapping between Voynichian glyphs and candidate phonemic values. The core component of the decoding algorithm is the maximization of a fitness function which aims to find the optimal set of substitution rules allowing us to transcribe the part of the manuscript which we call the Calendar into lists of feminine names. This leads to sets of character substitution rules which allow us to consistently transcribe dozens among three hundred calendar tokens into feminine names: a result far surpassing both "popular" and "state of the art" attempts to crack the manuscript. What's more, by using name lists stemming from different languages as potential cribs, our "adaptive" method can also be useful in identifying the language in which the manuscript is written. As far as we can currently tell, the results of our experiments indicate that the Calendar part of the manuscript contains names from Balto-Slavic, Balkan or Hebrew language strata. Two further indications are also given: primo, the highest fitness values were obtained when the crib list contains names with specific infixes at the token's penultimate position, as is the case, for example, for Slavic feminine diminutives (i.e. names ending with -ka and not -a). In the most successful scenario, 240 characters contained in 35 distinct Voynichese tokens were successfully transcribed. Secundo, in the case of a crib stemming from the Hebrew language, the whole adaptation process converges to significantly better fitness values when transcribing Voynichian tokens whose order of individual characters has been reversed, and when lists of feminine and not masculine names are used as the crib.
Computation and Language, Information Retrieval
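
The approach summarized in the abstract, an evolutionary search for glyph-to-phoneme substitution rules scored against a crib of names, can be illustrated with a minimal Python sketch. This is not the author's implementation: the function names, the truncation selection scheme, the single-glyph mutation operator, and the fitness definition (total characters in distinct tokens whose transcription appears in the crib) are all assumptions made for illustration, and the tokens in any usage are hypothetical placeholders rather than real Voynich transcriptions.

```python
import random

def transcribe(token, mapping, reverse_glyphs=False):
    """Apply a glyph -> phoneme substitution to one token.
    reverse_glyphs mimics the reversed-order reading mentioned for Hebrew cribs."""
    if reverse_glyphs:
        token = token[::-1]
    # Glyphs without a rule are rendered as "?" (an assumption of this sketch).
    return "".join(mapping.get(glyph, "?") for glyph in token)

def fitness(mapping, tokens, crib, reverse_glyphs=False):
    """Assumed fitness: number of characters in distinct tokens
    whose transcription is found in the crib name list."""
    return sum(
        len(token)
        for token in set(tokens)
        if transcribe(token, mapping, reverse_glyphs) in crib
    )

def mutate(mapping, phonemes, rng):
    """Return a copy of the mapping with one randomly chosen rule changed."""
    child = dict(mapping)
    glyph = rng.choice(sorted(child))
    child[glyph] = rng.choice(phonemes)
    return child

def evolve(tokens, crib, phonemes, generations=200, pop_size=30,
           seed=0, reverse_glyphs=False):
    """Toy evolutionary loop: keep the fitter half, refill with mutants."""
    rng = random.Random(seed)
    glyphs = sorted({glyph for token in tokens for glyph in token})
    population = [
        {glyph: rng.choice(phonemes) for glyph in glyphs}
        for _ in range(pop_size)
    ]

    def score(candidate):
        return fitness(candidate, tokens, crib, reverse_glyphs)

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        survivors = population[: pop_size // 2]
        offspring = [
            mutate(rng.choice(survivors), phonemes, rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + offspring
    return max(population, key=score)
```

Running `evolve` once per candidate crib (for example, name lists from different languages) and comparing the best fitness reached is one simple way to mirror the language-identification idea described in the abstract, under the same assumptions.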