Abstract: Linear sequences of words are implicitly represented in our brains by hierarchical structures that organize the composition of words in sentences. Linguists have formalized different frameworks to model this hierarchy; two of the most common syntactic frameworks are constituency and dependency. Constituency represents a sentence as nested groups of phrases, while dependency represents a sentence by assigning relations between its words. Recently, the pursuit of intelligent machines has produced Language Models (LMs) capable of solving many language tasks with human-level performance. Many studies now question whether LMs implicitly represent syntactic hierarchies. This thesis focuses on producing constituency and dependency structures from LMs in an unsupervised setting. I review the critical methods in this field and highlight a line of work that uses a numerical representation of binary constituency trees (Syntactic Distance). I present a detailed study of StructFormer (SF) (Shen et al., 2021), which retrofits a transformer encoder architecture with a parser network to produce constituency and dependency structures. I present six experiments that analyze and address this field's challenges; they include investigating the effect of repositioning the parser network within the SF architecture, evaluating subword-based induced trees, and benchmarking the models developed in the thesis experiments on linguistic tasks. Model benchmarking is performed by participating in the BabyLM challenge, published at CoNLL 2023 (Momen et al., 2023). The results of this thesis encourage further development in the direction of retrofitting transformer-based models to induce syntactic structures. This conclusion is supported by the acceptable performance of SF in different experimental settings and by the observed limitations, which require innovative solutions to advance the state of syntactic structure induction.
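
As an illustration of the Syntactic Distance idea mentioned in the abstract, the following minimal sketch converts a sequence of distances between adjacent words into a binary constituency tree by recursively splitting at the largest distance. It shows the general greedy top-down procedure commonly described for syntactic distances, not the thesis's or StructFormer's actual implementation; the function name `distances_to_tree`, the nested-tuple tree format, and the example distance values are all assumptions made for illustration.

```python
from typing import Sequence, Tuple, Union

# A tree is either a single word or a pair (left subtree, right subtree).
Tree = Union[str, Tuple["Tree", "Tree"]]


def distances_to_tree(words: Sequence[str], distances: Sequence[float]) -> Tree:
    """Build a binary tree from syntactic distances (hypothetical helper).

    distances[i] is the distance between words[i] and words[i + 1], so a
    sentence of n words needs exactly n - 1 distances.
    """
    if len(words) != len(distances) + 1:
        raise ValueError("expected exactly len(words) - 1 distances")
    if len(words) == 1:
        return words[0]

    # Split at the largest distance; ties resolve to the leftmost position.
    split = max(range(len(distances)), key=lambda i: distances[i])

    # Words up to and including `split` form the left constituent.
    left = distances_to_tree(words[: split + 1], distances[:split])
    right = distances_to_tree(words[split + 1 :], distances[split + 1 :])
    return (left, right)


# Illustrative values only: the largest distance sits between "cat" and "sat".
example_tree = distances_to_tree(["the", "cat", "sat", "down"], [1.0, 3.0, 2.0])
```

In this sketch, a larger distance between two adjacent words means they are separated higher up in the tree, so the first split yields the two top-level constituents and the recursion handles each side independently.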

Discourse structure interacts with reference but not syntax in neural language models

Does injecting linguistic structure into language models lead to better alignment with brain recordings?

eLife Assessment: Finding Structure During Incremental Speech Comprehension

Finding Structure in Language Models

Conceptual structure coheres in human cognition but not in large language models

How Do Local Syntactic Structures Influence Global Properties in Language Networks?

Neural reality of argument structure constructions

Syntactic Structure from Deep Learning

Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models

Do Neural Language Models Show Preferences for Syntactic Formalisms?

Searching for Structure: Investigating Emergent Communication with Large Language Models

Evaluating Discourse in Structured Text Representations

Injecting structural hints: Using language models to study inductive biases in language learning

Do Language Models' Words Refer?

Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations

Comparing Models of Associative Meaning: An Empirical Investigation of Reference in Simple Language Games

Modeling structure-building in the brain with CCG parsing and large language models

Linguistic Structure Induction from Language Models

Evaluating Neural Language Models as Cognitive Models of Language Acquisition

Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment

Structural Similarities Between Language Models and Neural Response Measurements