Abstract: Recursive processing in sentence comprehension is considered a hallmark of human linguistic abilities. However, its underlying neural mechanisms remain largely unknown. We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing, namely the storing of grammatical number and gender information in working memory and its use in long-distance agreement (e.g., capturing the correct number agreement between subject and verb when they are separated by other phrases). Although the network, a recurrent architecture with Long Short-Term Memory units, was solely trained to predict the next word in a large corpus, analysis showed the emergence of a very sparse set of specialized units that successfully handled local and long-distance syntactic agreement for grammatical number. However, the simulations also showed that this mechanism does not support full recursion and fails with some long-range embedded dependencies. We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns, with or without embedding. Human and model error patterns were remarkably similar, showing that the model echoes various effects observed in human data. However, a key difference was that, with embedded long-range dependencies, humans remained above chance level, while the model's systematic errors brought it below chance. Overall, our study shows that exploring the ways in which modern artificial neural networks process sentences leads to precise and testable hypotheses about human linguistic performance.

Colorless green recurrent networks dream hierarchically

Residual Recurrent Neural Networks for Learning Sequential Representations

When Are Tree Structures Necessary for Deep Learning of Representations?

Do RNNs learn human-like abstract word order preferences?

Mechanisms for handling nested dependencies in neural-network language models and humans

Representation of linguistic form and function in recurrent neural networks

Recurrent babbling: evaluating the acquisition of grammar from limited input data

On the Practical Ability of Recurrent Neural Networks to Recognize Hierarchical Languages

Recurrent Memory Networks for Language Modeling

Learning Longer Memory in Recurrent Neural Networks

Which Neural Network Architecture matches Human Behavior in Artificial Grammar Learning?

Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations

What Syntactic Structures block Dependencies in RNN Language Models?

Subregular Complexity and Deep Learning

Hierarchically Gated Recurrent Neural Network for Sequence Modeling

Do RNN States Encode Abstract Phonological Processes?

Precision, Stability, and Generalization: A Comprehensive Assessment of RNNs learnability capability for Classifying Counter and Dyck Languages

RNNs can generate bounded hierarchical languages with optimal memory

Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment

Riemannian metrics for neural networks II: recurrent networks and learning symbolic data sequences

Can LSTM Learn to Capture Agreement? The Case of Basque