The Temporal Structure of Language Processing in the Human Brain Corresponds to The Layered Hierarchy of Deep Language Models

Ariel Goldstein, Eric Ham, Mariano Schain, Samuel Nastase, Zaid Zada, Avigail Dabush, Bobbi Aubrey, Harshvardhan Gazula, Amir Feder, Werner K Doyle, Sasha Devore, Patricia Dugan, Daniel Friedman, Roi Reichart, Michael Brenner, Avinatan Hassidim, Orrin Devinsky, Adeen Flinker, Omer Levy, Uri Hasson
DOI: https://doi.org/10.1101/2022.07.11.499562
2023-10-11
Abstract: Deep Language Models (DLMs) provide a novel computational paradigm for understanding the mechanisms of natural language processing in the human brain. Unlike traditional psycholinguistic models, DLMs use layered sequences of continuous numerical vectors to represent words and context, allowing a plethora of emerging applications such as human-like text generation. In this paper we show evidence that the layered hierarchy of DLMs may be used to model the temporal dynamics of language comprehension in the brain by demonstrating a strong correlation between DLM layer depth and the time at which layers are most predictive of the human brain. Our ability to temporally resolve individual layers benefits from our use of electrocorticography (ECoG) data, which has a much higher temporal resolution than noninvasive methods like fMRI. Using ECoG, we record neural activity from participants listening to a 30-minute narrative while also feeding the same narrative to a high-performing DLM (GPT2-XL). We then extract contextual embeddings from the different layers of the DLM and use linear encoding models to predict neural activity. We first focus on the Inferior Frontal Gyrus (IFG, or Broca's area) and then extend our model to track the increasing temporal receptive window along the linguistic processing hierarchy from auditory to syntactic and semantic areas. Our results reveal a connection between human language processing and DLMs, with the DLM's layer-by-layer accumulation of contextual information mirroring the timing of neural activity in high-order language areas.
Computation and Language, Artificial Intelligence, Machine Learning, Neurons and Cognition
What problem does this paper attempt to address?
This paper asks whether the layered hierarchy of deep language models (DLMs) can be used to model the temporal dynamics of natural language processing in the human brain. The researchers used electrocorticography (ECoG), which offers high spatiotemporal resolution, to record neural activity in the language areas of participants while they listened to a 30-minute narrative. They fed the same narrative to a high-performing DLM (GPT2-XL) and extracted contextual embeddings from its different layers. By predicting neural activity from these embeddings with linear encoding models, they tested whether the order of the DLM's layers matches the temporal sequence of language processing in the brain.

### Main research questions

1. **Can the layered hierarchy of deep language models model the temporal dynamics of natural language comprehension in the human brain?**
   - By comparing the embeddings from each DLM layer with the neural activity recorded by ECoG, the researchers examined whether layer depth reflects the temporal order in which the brain accumulates information during language processing.
2. **Can the layered hierarchy of DLMs explain the temporal response characteristics of different language-processing regions?**
   - The researchers focused first on the inferior frontal gyrus (IFG) and then extended the analysis to other language-related regions (such as the anterior and middle parts of the superior temporal gyrus) to test whether the same correspondence appears across a broader language-processing network.

### Method overview

- **Data collection**: ECoG data were recorded from nine epilepsy patients while they listened to a 30-minute audio podcast.
- **Model selection**: GPT2-XL was used to extract a contextual embedding for each word at each of its 48 layers.
- **Analysis method**: Linear encoding models were fit to predict the ECoG-recorded neural activity from the embeddings of each layer, and the prediction performance of the different layers was compared.

### Key findings

- **Hierarchical temporal dynamics in the IFG**:
  - In the IFG, embeddings from early layers predict neural activity best at short lags around word onset, while embeddings from later layers predict it best at longer lags after word onset. This indicates that the order of the DLM's layers matches the temporal order of information accumulation in the IFG.
- **Verification in other language regions**:
  - Similar analyses in other language-related regions (such as the anterior and middle parts of the superior temporal gyrus) showed similar temporal dynamics, suggesting that the correspondence with the DLM's layered hierarchy holds across language-processing regions.

### Conclusion

This study provides preliminary evidence that the layered hierarchy of deep language models can be used to model the temporal dynamics of natural language processing in the human brain. The finding deepens our understanding of human language processing and suggests new directions for developing more advanced cognitive models.
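
### Illustrative sketch of the encoding analysis

The linear encoding analysis described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic stand-ins for GPT2-XL embeddings and ECoG signals, ridge regularization and 10-fold cross-validation are assumed choices, and the names `encoding_correlation`, `peak_lag_per_layer`, `lags_ms`, and the array shapes are hypothetical. The sketch fits one linear model per layer and lag, finds the lag at which each layer predicts best, and then correlates layer index with that peak lag.

```python
import numpy as np


def encoding_correlation(X, y, n_folds=10, alpha=1.0):
    """Cross-validated ridge encoding score (assumed setup, not the paper's).

    Fits a linear map from embeddings X (n_words, dim) to neural activity
    y (n_words,) on the training folds and returns the Pearson correlation
    between held-out predictions and the actual signal.
    """
    n_words, dim = X.shape
    preds = np.empty(n_words)
    for test_idx in np.array_split(np.arange(n_words), n_folds):
        train = np.ones(n_words, dtype=bool)
        train[test_idx] = False
        # Center with training statistics only, to avoid leakage.
        mu_x, mu_y = X[train].mean(axis=0), y[train].mean()
        Xc = X[train] - mu_x
        w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(dim),
                            Xc.T @ (y[train] - mu_y))
        preds[test_idx] = (X[test_idx] - mu_x) @ w + mu_y
    return np.corrcoef(preds, y)[0, 1]


def peak_lag_per_layer(embeddings, neural, lags):
    """Return, for each layer, the lag with the highest encoding score.

    embeddings: (n_layers, n_words, dim) embeddings per layer and word
    neural:     (n_words, n_lags) neural signal per word at each lag
    lags:       (n_lags,) lag values relative to word onset
    """
    peaks = []
    for layer_emb in embeddings:
        scores = [encoding_correlation(layer_emb, neural[:, j])
                  for j in range(len(lags))]
        peaks.append(lags[int(np.argmax(scores))])
    return np.array(peaks)


# Synthetic stand-in data: layer k drives the signal at lag k, plus noise.
rng = np.random.default_rng(0)
n_layers, n_words, dim = 4, 400, 8
lags_ms = np.array([-200, 0, 200, 400])  # hypothetical lags in milliseconds
embeddings = rng.standard_normal((n_layers, n_words, dim))
neural = np.stack([embeddings[k].sum(axis=1) for k in range(n_layers)], axis=1)
neural += rng.standard_normal(neural.shape)

peaks = peak_lag_per_layer(embeddings, neural, lags_ms)
# Correlation between layer depth and the lag of peak encoding performance.
layer_lag_r = np.corrcoef(np.arange(n_layers), peaks)[0, 1]
print(peaks, layer_lag_r)
```

Because the synthetic signal at each lag is built from a single layer, the layer-to-lag relationship here is present by construction. With real recordings, the embeddings would come from the model's hidden states and the neural matrix from ECoG activity aligned to word onsets, and the strength of the layer-lag correlation would be an empirical question.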