Abstract: Building computational models to account for the cortical representation of language plays an important role in understanding the human linguistic system. Recent progress in distributed semantic models (DSMs), especially transformer-based methods, has driven advances in many language understanding tasks, making DSMs a promising tool for probing language processing in the brain. DSMs have been shown to reliably explain cortical responses to word stimuli. However, brain activity during sentence processing has been explored far less thoroughly with DSMs, especially with deep neural network-based methods. What is the relationship between cortical sentence representations and DSMs? Which linguistic features captured by a DSM best explain its correlation with the brain activity evoked by sentence stimuli? Could distributed sentence representations help to reveal the semantic selectivity of different brain areas? We address these questions through the lens of neural encoding and decoding, fueled by the latest developments in natural language representation learning. We begin by evaluating the ability of a wide range of DSMs (12 in total) to predict and decipher functional magnetic resonance imaging (fMRI) images from humans reading sentences. Most models deliver high accuracy in the left middle temporal gyrus (LMTG) and left occipital complex (LOC). Notably, encoders trained with transformer-based DSMs consistently outperform other unsupervised structured models and all the unstructured baselines. With probing and ablation tasks, we further find that differences in the performance of the DSMs in modeling brain activity can be at least partially explained by the granularity of their semantic representations. We also illustrate the DSMs' selectivity for concept categories and show that the topics are represented by spatially overlapping and distributed cortical patterns.
Our results corroborate and extend previous findings on the relation between DSMs and neural activation patterns and contribute to building solid brain–machine interfaces with deep neural network representations.
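The abstract does not say how prediction and decoding quality are scored. As an illustration only, the sketch below implements pairwise matching accuracy, a score commonly used in encoding and decoding studies: for every pair of sentences, the correct assignment of predicted to observed patterns should correlate better than the swapped assignment. The function names and the toy vectors are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant vector carries no pattern to match
    return cov / sqrt(var_a * var_b)


def pairwise_accuracy(predicted, observed):
    """Fraction of sentence pairs (i, j) for which the correct matching
    (predicted[i] with observed[i], predicted[j] with observed[j])
    correlates more strongly than the swapped matching."""
    pairs = list(combinations(range(len(predicted)), 2))
    correct = 0
    for i, j in pairs:
        matched = (pearson(predicted[i], observed[i])
                   + pearson(predicted[j], observed[j]))
        swapped = (pearson(predicted[i], observed[j])
                   + pearson(predicted[j], observed[i]))
        correct += matched > swapped
    return correct / len(pairs)


# Toy data (hypothetical): three "sentences", four "voxels" each.
observed = [[1.0, 0.0, 2.0, 0.0],
            [0.0, 3.0, 0.0, 1.0],
            [2.0, 2.0, 0.0, 0.0]]
# Stand-in for an encoder's output: noisy copies of the observed patterns.
predicted = [[0.9, 0.1, 1.8, 0.2],
             [0.1, 2.5, 0.2, 1.1],
             [1.7, 2.1, 0.3, 0.1]]

score = pairwise_accuracy(predicted, observed)
print(score)
```

Chance level for this score is 0.5. The same function can score decoding by passing decoded and true sentence vectors in place of predicted and observed voxel patterns.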

Subword Representations Successfully Decode Brain Responses to Morphologically Complex Written Words

Information properties of morphologically complex words modulate brain activity during word reading

Semantic reconstruction of continuous language from MEG signals

Decoding individual words from non-invasive brain recordings across 723 participants

Reading visually embodied meaning from the brain: Visually grounded computational models decode visual-object mental imagery induced by written text

Statistical models of morphology predict eye-tracking measures during visual word recognition

Modality-Agnostic fMRI Decoding of Vision and Language

Brain2Word: Decoding Brain Activity for Language Generation

Decoding visual percepts induced by word reading with fMRI

Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models

Which Sense Dominates Multisensory Semantic Understanding? A Brain Decoding Study

Decoding Semantics from Dynamic Brain Activation Patterns: From Trials to Task in EEG/MEG Source Space

Decoding Brain Activity Associated with Literal and Metaphoric Sentence Comprehension Using Distributional Semantic Models

EEG decoding of spoken words in bilingual listeners: from words to language invariant semantic-conceptual representations

Experiential, Distributional and Dependency-based Word Embeddings have Complementary Roles in Decoding Brain Activity

Decoding speech perception from non-invasive brain recordings

MindFormer: Semantic Alignment of Multi-Subject fMRI for Brain Decoding

Long short‐term memory‐based neural decoding of object categories evoked by natural images

Decoding speech from non-invasive brain recordings

Decoding Semantics Categorization During Natural Viewing of Video Streams

Neural Encoding and Decoding With Distributed Sentence Representations