Exact Hard Monotonic Attention for Character-Level Transduction

Shijie Wu, Ryan Cotterell
2024-02-20
Abstract: Many common character-level, string-to-string transduction tasks, e.g., grapheme-to-phoneme conversion and morphological inflection, consist almost exclusively of monotonic transductions. However, neural sequence-to-sequence models that use non-monotonic soft attention often outperform popular monotonic models. In this work, we ask the following question: Is monotonicity really a helpful inductive bias for these tasks? We develop a hard attention sequence-to-sequence model that enforces strict monotonicity and learns a latent alignment jointly while learning to transduce. With the help of dynamic programming, we are able to compute the exact marginalization over all monotonic alignments. Our models achieve state-of-the-art performance on morphological inflection. Furthermore, we find strong performance on two other character-level transduction tasks. Code is available at https://github.com/shijie-wu/neural-transducer.
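The exact marginalization over monotonic alignments mentioned in the abstract can be illustrated with a forward-style recursion. The sketch below is not taken from the authors' code: the function names and the `init`, `trans`, and `emit` tables are hypothetical stand-ins for the probabilities a trained model would produce, and the factorization (an alignment transition followed by an emission at each target position) is an assumption made for illustration. A brute-force enumeration is included only to check the recursion on small inputs.

```python
import itertools
import math


def monotonic_marginal(init, trans, emit):
    """Sum the joint probability over all monotonic alignments.

    Hypothetical inputs (not the authors' API):
      init[j]        : p(a_1 = j), the first target character aligns to source j
      trans[i][k][j] : p(a_{i+2} = j | a_{i+1} = k); assumed zero when j < k
      emit[i][j]     : p(y_{i+1} | a_{i+1} = j), the emission probability
    """
    n_tgt, n_src = len(emit), len(init)
    # alpha[j]: total probability of all monotonic prefixes ending at source j
    alpha = [init[j] * emit[0][j] for j in range(n_src)]
    for i in range(1, n_tgt):
        # Monotonicity: position j is reachable only from k <= j.
        alpha = [
            emit[i][j] * sum(alpha[k] * trans[i - 1][k][j] for k in range(j + 1))
            for j in range(n_src)
        ]
    return sum(alpha)


def brute_force_marginal(init, trans, emit):
    """Enumerate every non-decreasing alignment (exponential; for checking only)."""
    n_tgt, n_src = len(emit), len(init)
    total = 0.0
    for a in itertools.product(range(n_src), repeat=n_tgt):
        if any(a[i] < a[i - 1] for i in range(1, n_tgt)):
            continue  # skip non-monotonic alignments
        p = init[a[0]] * emit[0][a[0]]
        for i in range(1, n_tgt):
            p *= trans[i - 1][a[i - 1]][a[i]] * emit[i][a[i]]
        total += p
    return total
```

With a source of length S and a target of length T, the recursion costs O(T · S²) operations, whereas the enumeration grows exponentially in T. A practical implementation would likely work in log space to avoid underflow on longer strings.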
Computation and Language