Abstract: Theoretical linguistics seeks to explain what human language is, and why. Linguists and cognitive scientists have proposed different theoretical models of what language is, as well as cognitive factors that shape it, and allow humans to 'produce', 'understand', and 'acquire' natural languages. However, humans may no longer be the only ones learning to 'generate', 'parse', and 'learn' natural language: artificial intelligence (AI) models such as large language models are proving to have impressive linguistic capabilities. Many are thus questioning what role, if any, such models should play in helping theoretical linguistics reach its ultimate research goals. In this paper, we propose to answer this question by reiterating the tenets of generative linguistics, a leading school of thought in the field, and by considering how AI models as theories of language relate to each of these important concepts. Specifically, we consider three foundational principles, finding roots in the early works of Noam Chomsky: (1) levels of theoretical adequacy; (2) procedures for linguistic theory development; (3) language learnability and Universal Grammar. In our discussions of each principle, we give special attention to two types of AI models: neural language models and neural grammar induction models. We will argue that such models, in particular neural grammar induction models, do have a role to play, but that this role is largely modulated by the stance one takes regarding each of these three guiding principles.

What problem does this paper attempt to address?

This paper explores how theoretical linguistics should pursue its research goals in the era of artificial intelligence, especially after the emergence of large language models (LLMs). Specifically, it re-examines the main theoretical foundations of generative grammar and considers how AI models, treated as theories of language, relate to these fundamental concepts. The paper focuses on three core principles:

1. **Levels of theoretical adequacy**: The different standards that a linguistic theory or model can meet when describing linguistic phenomena: observational adequacy, descriptive adequacy, and explanatory adequacy.
2. **Procedures for the development of linguistic theories**: Discovery, decision, and evaluation procedures, that is, how to discover the best grammar from data, how to decide whether a given grammar is the best choice, and how to evaluate the best candidate among a set of possible grammars.
3. **Language learnability and Universal Grammar**: How languages are learned, and how human language learning shapes linguistic structure.

The paper discusses two types of AI models in particular: neural language models (such as LLMs) and neural grammar induction models. The authors argue that although neural language models perform very well in language processing, they mostly remain at the level of observational adequacy and struggle to provide descriptions or explanations of linguistic behavior. In contrast, neural grammar induction models can learn explicit syntactic rules and therefore have greater potential for descriptive and explanatory adequacy. These models can simulate not only "adult" grammars but also the entire language acquisition process, which makes them promising tools for research in theoretical linguistics.
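As a rough illustration of the evaluation procedure mentioned above (choosing the best candidate among a set of possible grammars), the following Python sketch may help. It is not taken from the paper: the toy grammars, the corpus, the names `generates` and `evaluate`, and the use of rule count as the simplicity measure are all assumptions made for the example.

```python
# Toy grammars in Chomsky normal form: binary rules (A -> B C) and
# lexical rules (A -> word). All grammars and sentences are invented.
G_SMALL = {
    "binary": {("S", "NP", "VP"), ("VP", "V", "NP")},
    "lexical": {("NP", "dogs"), ("NP", "cats"), ("V", "chase")},
}
G_LARGE = {
    "binary": {("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "NP", "NP")},
    "lexical": {("NP", "dogs"), ("NP", "cats"), ("V", "chase"), ("V", "see")},
}
G_GAPPY = {  # lacks the word "cats", so it cannot cover the corpus
    "binary": {("S", "NP", "VP"), ("VP", "V", "NP")},
    "lexical": {("NP", "dogs"), ("V", "chase")},
}


def generates(grammar, sentence, start="S"):
    """CYK recognition: True if the grammar derives the sentence from `start`."""
    words = sentence.split()
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the nonterminals that derive words[i..j] inclusive.
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        chart[i][i] = {lhs for lhs, w in grammar["lexical"] if w == word}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point between the two children
                for lhs, left, right in grammar["binary"]:
                    if left in chart[i][k] and right in chart[k + 1][j]:
                        chart[i][j].add(lhs)
    return start in chart[0][n - 1]


def size(grammar):
    """Assumed simplicity measure: total number of rules."""
    return len(grammar["binary"]) + len(grammar["lexical"])


def evaluate(candidates, corpus):
    """Among candidates that generate every corpus sentence, return the
    name of the smallest one, or None if no candidate covers the corpus."""
    adequate = {
        name: g for name, g in candidates.items()
        if all(generates(g, s) for s in corpus)
    }
    if not adequate:
        return None
    return min(adequate, key=lambda name: size(adequate[name]))


corpus = ["dogs chase cats", "cats chase dogs"]
candidates = {"small": G_SMALL, "large": G_LARGE, "gappy": G_GAPPY}
best = evaluate(candidates, corpus)
```

Under these assumptions, `best` names the smallest grammar that covers the sample. Covering the observed sentences corresponds only loosely to observational adequacy on that sample; the sketch says nothing about descriptive or explanatory adequacy, which is where the paper places the more interesting questions.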

"On the goals of linguistic theory": Revisiting Chomskyan theories in the era of AI

Why Linguistics Will Thrive in the 21st Century: A Reply to Piantadosi (2023)

Generative linguistics contribution to artificial intelligence: Where this contribution lies?

Integrating Linguistic Theory and Neural Language Models

What Artificial Neural Networks Can Tell Us About Human Language Acquisition

A Philosophical Introduction to Language Models -- Part I: Continuity With Classic Debates

Exploring the Role of Generative AI in Enhancing Language Learning: Opportunities and Challenges

Artificial intelligence in applied (linguistics): a content analysis and future prospects

A Critical Evaluation of the Theory of Universal Grammar and its Contribution to Second Language Learning and Teaching

Generative AI, Pragmatics, and Authenticity in Second Language Learning

On the proper role of linguistically-oriented deep net analysis in linguistic theorizing

Large Language Models and Generative AI, Oh My!

Large language models as linguistic simulators and cognitive models in human research

Generalisation of language and knowledge models for corpus analysis

Language Models as Models of Language

Universal Grammar and Universal Grammar’s Influence and Related Theories Concerning Second Language Acquisition

Evaluating Neural Language Models as Cognitive Models of Language Acquisition

Large Language Models: A Historical and Sociocultural Perspective

Large language models and linguistic intentionality

Towards a neural architecture of language: Deep learning versus logistics of access in neural architectures for compositional processing

Frontier AI Ethics: Anticipating and Evaluating the Societal Impacts of Language Model Agents