Low-Rank Hidden State Embeddings for Viterbi Sequence Labeling

Dung Thai,Shikhar Murty,Trapit Bansal,Luke Vilnis,David Belanger,Andrew McCallum
DOI: https://doi.org/10.48550/arXiv.1708.00553
2017-08-02
Abstract:In textual information extraction and other sequence labeling tasks it is now common to use recurrent neural networks (such as LSTM) to form rich embedded representations of long-term input co-occurrence patterns. Representation of output co-occurrence patterns is typically limited to a hand-designed graphical model, such as a linear-chain CRF representing short-term Markov dependencies among successive labels. This paper presents a method that learns embedded representations of latent output structure in sequence data. Our model takes the form of a finite-state machine with a large number of latent states per label (a latent variable CRF), where the state-transition matrix is factorized---effectively forming an embedded representation of state-transitions capable of enforcing long-term label dependencies, while supporting exact Viterbi inference over output labels. We demonstrate accuracy improvements and interpretable latent structure in a synthetic but complex task based on CoNLL named entity recognition.
Computation and Language
What problem does this paper attempt to address?