Abstract: Language models (LMs) have demonstrated remarkable proficiency in generating linguistically coherent text, sparking discussions about their relevance to understanding human language learnability. However, a significant gap exists between the training data for these models and the linguistic input a child receives. LMs are typically trained on data that is orders of magnitude larger and fundamentally different from child-directed speech (Warstadt and Bowman, 2022; Warstadt et al., 2023; Frank, 2023a). Addressing this discrepancy, our research focuses on training LMs on subsets of a single child's linguistic input. Previously, Wang, Vong, Kim, and Lake (2023) found that LMs trained in this setting can form syntactic and semantic word clusters and develop sensitivity to certain linguistic phenomena, but they only considered LSTMs and simpler neural networks trained from just one single-child dataset. Here, to examine the robustness of learnability from single-child input, we systematically train six different model architectures on five datasets (3 single-child and 2 baselines). We find that the models trained on single-child datasets showed consistent results that matched with previous work, underscoring the robustness of forming meaningful syntactic and semantic representations from a subset of a child's linguistic input.

What problem does this paper attempt to address?

This paper investigates whether language models (LMs) can learn from the linguistic input of an individual child. LMs are typically trained on data that is orders of magnitude larger than, and fundamentally different from, the child-directed speech a child actually receives. To narrow this gap, the researchers train LMs on subsets of a single child's linguistic input and examine whether the models form meaningful syntactic and semantic representations.

The paper starts from the observation that children are efficient language learners, yet the mechanisms behind their language acquisition remain poorly understood. Transformer-based large language models (LLMs) excel at generating coherent text, which has sparked discussion about whether they reflect human language learning mechanisms. Previous studies mainly trained models on multi-child datasets, and the earlier single-child study by Wang, Vong, Kim, and Lake (2023) considered only LSTMs and simpler neural networks trained on one single-child dataset. This paper tests how robust those findings are by systematically training six different model architectures on five datasets (three single-child datasets and two baseline datasets).

The results indicate that, across model architectures and single-child datasets, the models form syntactic and semantic word categories consistent with previous work, which demonstrates the robustness of this learning ability. The paper uses several evaluation methods, including linguistic acceptability tests, word vector visualization, and cloze tests, to examine the models' performance in different settings. All models consistently distinguish nouns, verbs, and other word categories, and show sensitivity to certain linguistic phenomena. However, they still struggle with more complex phenomena such as subject-verb agreement.

In summary, the paper asks whether language models can learn meaningful syntactic and semantic structure solely from the linguistic input of a single child, and whether this ability holds across architectures and datasets. The findings indicate that, despite the remaining difficulties, models can form such representations from a subset of one child's linguistic input.
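
The linguistic acceptability tests mentioned above are commonly run on minimal pairs: a model is counted as correct when it assigns a higher probability to the grammatical sentence than to its ungrammatical counterpart. The following is a minimal sketch of that scoring logic only. It assumes a toy add-one-smoothed bigram model in place of the neural LMs used in the paper, and the corpus, the sentence pairs, and the function names (`train_bigram`, `log_prob`, `minimal_pair_accuracy`) are all invented for illustration.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count context tokens and bigrams over whitespace-tokenized sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])                  # every token that has a successor
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # adjacent token pairs
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Sentence log-probability under the bigram model with add-one smoothing."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
        for prev, cur in zip(tokens[:-1], tokens[1:])
    )


def minimal_pair_accuracy(pairs, model):
    """Fraction of (grammatical, ungrammatical) pairs where the grammatical one scores higher."""
    correct = sum(log_prob(good, model) > log_prob(bad, model) for good, bad in pairs)
    return correct / len(pairs)


# Invented toy corpus standing in for transcribed child-directed speech.
corpus = [
    "the dog runs",
    "the dogs run",
    "the cat runs",
    "the cats run",
    "a dog sleeps",
]

# Invented subject-verb agreement minimal pairs: (grammatical, ungrammatical).
pairs = [
    ("the dog runs", "the dog run"),
    ("the cats run", "the cats runs"),
]

model = train_bigram(corpus)
accuracy = minimal_pair_accuracy(pairs, model)
print(f"minimal-pair accuracy: {accuracy:.2f}")
```

A bigram model only sees adjacent words, so it can pass these pairs purely from local co-occurrence. Agreement across intervening words would need a model with longer context, which is one reason such tests are informative about what a model has learned.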

A systematic investigation of learnability from single child linguistic input

Finding Structure in One Child's Linguistic Experience

Evaluating Neural Language Models as Cognitive Models of Language Acquisition

Is Child-Directed Speech Effective Training Data for Language Models?

From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition

KidLM: Advancing Language Models for Children -- Early Insights and Future Directions

What Artificial Neural Networks Can Tell Us About Human Language Acquisition

Language acquisition: do children and language models follow similar learning stages?

Supervised Knowledge Makes Large Language Models Better In-context Learners

Spoken Language Intelligence of Large Language Models for Language Learning

Language Models Don't Learn the Physical Manifestation of Language

ZhoBLiMP: a Systematic Assessment of Language Models with Linguistic Minimal Pairs in Chinese

Real-time implementation of synthetic aperture vector flow imaging on a consumer-level tablet

Can training neural language models on a curriculum with developmentally plausible data improve alignment with human reading behavior?

Exploring the LLM Journey from Cognition to Expression with Linear Representations

CLIMB: Curriculum Learning for Infant-inspired Model Building

Acquiring Linguistic Knowledge from Multimodal Input

Limits for Learning with Language Models

A Language-agnostic Model of Child Language Acquisition

Opening the black box of language acquisition

Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies