Dependency Grammar Induction with Neural Lexicalization and Big Training Data

Wenjuan Han, Yong Jiang, Kewei Tu
DOI: https://doi.org/10.48550/arXiv.1708.00801
2017-08-02
Abstract: We study the impact of big models (in terms of the degree of lexicalization) and big data (in terms of the training corpus size) on dependency grammar induction. We experimented with L-DMV, a lexicalized version of the Dependency Model with Valence, and L-NDMV, our lexicalized extension of the Neural Dependency Model with Valence. We find that L-DMV benefits only from very small degrees of lexicalization and moderate sizes of training corpora. L-NDMV can benefit from big training data and greater degrees of lexicalization, especially when enhanced with good model initialization, and it achieves a result that is competitive with the current state of the art.
Computation and Language