A Myhill-Nerode theorem for register automata and symbolic trace languages
Frits Vaandrager, Abhisek Midya
DOI: https://doi.org/10.1016/j.tcs.2022.01.015
IF: 1.002
2022-04-01
Theoretical Computer Science
Abstract: We propose a new symbolic trace semantics for register automata (extended finite state machines) which records both the sequence of input symbols that occur during a run and the constraints on input parameters that are imposed by this run. Our main result is a generalization of the classical Myhill-Nerode theorem to this symbolic setting. Our generalization requires the use of three relations to capture the additional structure of register automata. Location equivalence ≡_l captures that symbolic traces end in the same location, transition equivalence ≡_t captures that they share the same final transition, and a partial equivalence relation ≡_r captures that symbolic values v and v′ are stored in the same register after symbolic traces w and w′, respectively. A symbolic language is defined to be regular if relations ≡_l, ≡_t and ≡_r exist that satisfy certain conditions; in particular, they all have finite index. We show that the symbolic language associated to a register automaton is regular, and we construct, for each regular symbolic language, a register automaton that accepts this language. Our result provides a foundation for grey-box learning algorithms in settings where the constraints on data parameters can be extracted from code using, e.g., tools for symbolic/concolic execution or tainting. Moving to a grey-box setting may overcome the scalability problems of state-of-the-art black-box learning algorithms.
computer science, theory & methods
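
Illustration: the abstract's idea of a run that yields both the input symbols and the constraints on their parameters can be made concrete with a small sketch. The code below is not the paper's formal definition. It assumes a toy register automaton with a single register `x`, a hypothetical three-guard language written as plain strings (`"true"`, `"p == x"`, `"p != x"`), and invented location and symbol names (`q0`, `q1`, `q2`, `register`, `login`). The function `run` returns the final location together with a simplified symbolic trace, recorded here as a list of (symbol, guard) pairs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Transition:
    source: str   # location the transition leaves
    symbol: str   # input symbol, carrying one data parameter p
    guard: str    # hypothetical guard language: "true", "p == x", "p != x"
    store: bool   # if True, the parameter p is written to register x
    target: str   # location the transition enters


# Toy automaton (an assumption for illustration): store a value, then
# accept once the same value is supplied again.
TRANSITIONS = [
    Transition("q0", "register", "true", True, "q1"),
    Transition("q1", "login", "p == x", False, "q2"),
    Transition("q1", "login", "p != x", False, "q1"),
]


def guard_holds(guard: str, p: int, x: Optional[int]) -> bool:
    """Evaluate a guard against parameter p and register content x."""
    if guard == "true":
        return True
    if guard == "p == x":
        return p == x
    if guard == "p != x":
        return p != x
    raise ValueError(f"unknown guard: {guard}")


def run(word: List[Tuple[str, int]]):
    """Run the automaton on a data word.

    Returns (final location, symbolic trace), or None if some input
    has no enabled transition.
    """
    location, x = "q0", None
    trace = []
    for symbol, p in word:
        for t in TRANSITIONS:
            if (t.source == location and t.symbol == symbol
                    and guard_holds(t.guard, p, x)):
                trace.append((symbol, t.guard))  # record symbol and constraint
                if t.store:
                    x = p
                location = t.target
                break
        else:
            return None  # no transition enabled for this input
    return location, trace


result = run([("register", 7), ("login", 3), ("login", 7)])
```

In this sketch, two data words that differ only in their concrete values (for example, registering 7 or 42 and then logging in with the same value) produce the same (symbol, guard) sequence, which is the kind of abstraction a symbolic trace is meant to provide.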