Abstract: A sentence is more than the sum of its words: its meaning depends on how they combine with one another. The brain mechanisms underlying such semantic composition remain poorly understood. To shed light on the neural vector code underlying semantic composition, we introduce two hypotheses: (1) the intrinsic dimensionality of the space of neural representations should increase as a sentence unfolds, paralleling the growing complexity of its semantic representation; and (2) this progressive integration should be reflected in ramping and sentence-final signals. To test these predictions, we designed a dataset of closely matched normal and jabberwocky sentences (composed of meaningless pseudo words) and displayed them to deep language models and to 11 human participants (5 men and 6 women) monitored with simultaneous MEG and intracranial EEG. In both deep language models and electrophysiological data, we found that representational dimensionality was higher for meaningful sentences than jabberwocky. Furthermore, multivariate decoding of normal versus jabberwocky confirmed three dynamic patterns: (1) a phasic pattern following each word, peaking in temporal and parietal areas; (2) a ramping pattern, characteristic of bilateral inferior and middle frontal gyri; and (3) a sentence-final pattern in left superior frontal gyrus and right orbitofrontal cortex. These results provide a first glimpse into the neural geometry of semantic integration and constrain the search for a neural code of linguistic composition.

Significance Statement: Starting from general linguistic concepts, we make two sets of predictions in neural signals evoked by reading multiword sentences. First, the intrinsic dimensionality of the representation should grow with additional meaningful words. Second, the neural dynamics should exhibit signatures of encoding, maintaining, and resolving semantic composition. We successfully validated these hypotheses in deep neural language models, artificial neural networks trained on text and performing very well on many natural language processing tasks. Then, using a unique combination of MEG and intracranial electrodes, we recorded high-resolution brain data from human participants while they read a controlled set of sentences. Time-resolved dimensionality analysis showed increasing dimensionality with meaning, and multivariate decoding allowed us to isolate the three dynamical patterns we had hypothesized.
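
The abstract does not say which estimator of intrinsic dimensionality was used. As an illustration only, one common proxy is the participation ratio of the covariance eigenvalues, (Σλ)² / Σλ², which is close to k when variance is spread evenly over k directions. The sketch below applies it to simulated data; the function name `participation_ratio`, the helper `simulate`, and all sizes are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def participation_ratio(X):
    """Participation ratio of the covariance spectrum of X (n_samples, n_features).

    This is an illustrative proxy for intrinsic dimensionality,
    not necessarily the estimator used in the paper.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0, keepdims=True)  # center each feature
    # Squared singular values are proportional to covariance eigenvalues;
    # the proportionality constant cancels in the ratio below.
    s = np.linalg.svd(Xc, compute_uv=False)
    lam = s ** 2
    total = lam.sum()
    if total == 0:
        return 0.0
    return float(total ** 2 / (lam ** 2).sum())


rng = np.random.default_rng(0)
n_samples, n_features = 2000, 50  # hypothetical sizes


def simulate(rank):
    """Simulated 'recordings' whose variance lies in `rank` orthogonal directions."""
    latent = rng.standard_normal((n_samples, rank))
    # Orthonormal mixing matrix of shape (n_features, rank)
    mixing = np.linalg.qr(rng.standard_normal((n_features, rank)))[0]
    return latent @ mixing.T


pr_low = participation_ratio(simulate(3))
pr_high = participation_ratio(simulate(10))
print(f"rank 3:  PR = {pr_low:.2f}")
print(f"rank 10: PR = {pr_high:.2f}")
```

In a time-resolved setting, such a measure could be computed separately at each time point across trials and then compared between normal and jabberwocky sentences; this is one possible reading of the analysis, not a description of the authors' pipeline.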

Neural Encoding and Decoding With Distributed Sentence Representations

Delta-band Neural Activity Primarily Tracks Sentences Instead of Semantic Properties of Words

Towards Linguistic Neural Representation Learning and Sentence Retrieval from Electroencephalogram Recordings

Deep Neural Networks and Brain Alignment: Brain Encoding and Decoding (Survey)

Decoding Linguistic Representations of Human Brain

Neural Language Taskonomy: Which NLP Tasks are the most Predictive of fMRI Brain Activity?

A dual‐channel language decoding from brain activity with progressive transfer training

Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models

Exploring Semantic Representation in Brain Activity Using Word Embeddings

Enhancing neural encoding models for naturalistic perception with a multi-level integration of deep neural networks and cortical networks

Describing Semantic Representations of Brain Activity Evoked by Visual Stimuli

Experiential, Distributional and Dependency-based Word Embeddings have Complementary Roles in Decoding Brain Activity

Neuro-Vision to Language: Enhancing Brain Recording-based Visual Reconstruction and Language Interaction

Decoding Brain Activity Associated with Literal and Metaphoric Sentence Comprehension Using Distributional Semantic Models

Deep learning models of cognitive processes constrained by human brain connectomes

Brain encoding models based on multimodal transformers can transfer across language and vision

Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)

Decoding speech from non-invasive brain recordings

A neural decoding algorithm that generates language from visual activity evoked by natural images

Dimensionality and Ramping: Signatures of Sentence Integration in the Dynamics of Brains and Deep Language Models

Deep Recurrent Encoder: A scalable end-to-end network to model brain signals