Modeling the Complexity and Descriptive Adequacy of Construction Grammars

Jonathan Dunn
DOI: https://doi.org/10.48550/arXiv.1904.05588
2019-04-11
Abstract: This paper uses the Minimum Description Length paradigm to model the complexity of Construction Grammars (CxGs), operationalized as the encoding size of a grammar, alongside their descriptive adequacy, operationalized as the encoding size of a corpus given a grammar. These two quantities are combined to measure the quality of potential CxGs against unannotated corpora, supporting discovery-device CxGs for English, Spanish, French, German, and Italian. The results show (i) that these grammars provide significant generalizations as measured using compression and (ii) that more complex CxGs with access to multiple levels of representation provide greater generalizations than single-representation CxGs.
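The two quantities named in the abstract can be illustrated with a toy two-part description length: the cost of encoding the grammar plus the cost of encoding the corpus given that grammar. The sketch below is only an illustration under simplifying assumptions and is not the paper's encoding scheme. The function names (`code_length_bits`, `mdl_score`), the placeholder symbol `C1`, and the tiny corpus are all hypothetical, and each part is costed with a plain Shannon code over its own symbol frequencies.

```python
import math
from collections import Counter


def code_length_bits(symbols):
    """Shannon code length (bits) of a sequence under its own empirical distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    # Each occurrence of a symbol with relative frequency p costs -log2(p) bits.
    return -sum(c * math.log2(c / total) for c in counts.values())


def mdl_score(grammar_symbols, corpus_symbols):
    """Toy two-part score: grammar encoding size + corpus encoding size given the grammar."""
    return code_length_bits(grammar_symbols) + code_length_bits(corpus_symbols)


# Hypothetical corpus, tokenized on whitespace.
raw_corpus = "the cat sat the cat sat the dog ran".split()

# Baseline: no grammar, so the corpus is encoded token by token.
baseline = mdl_score([], raw_corpus)

# Candidate grammar with one construction, C1 = "the cat sat" (an assumption for
# illustration). The grammar pays for its three slots; the corpus then refers
# to the construction by a single symbol.
grammar = ["the", "cat", "sat"]
encoded_corpus = ["C1", "C1", "the", "dog", "ran"]
with_grammar = mdl_score(grammar, encoded_corpus)

print(f"baseline: {baseline:.2f} bits, with grammar: {with_grammar:.2f} bits")
```

In this toy case the candidate grammar earns its place only if the bits saved on the corpus exceed the bits spent on the grammar, which is the trade-off between complexity and descriptive adequacy that the abstract describes.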
Computation and Language
What problem does this paper attempt to address?