Neural Language Taskonomy: Which NLP Tasks are the most Predictive of fMRI Brain Activity?

Subba Reddy Oota, Jashn Arora, Veeral Agarwal, Mounika Marreddy, Manish Gupta, Bapi Raju Surampudi
DOI: https://doi.org/10.18653/v1/2022.naacl-main.235
2022-05-03
Abstract: Several popular Transformer based language models have been found to be successful for text-driven brain encoding. However, existing literature leverages only pretrained text Transformer models and has not explored the efficacy of task-specific learned Transformer representations. In this work, we explore transfer learning from representations learned for ten popular natural language processing tasks (two syntactic and eight semantic) for predicting brain responses from two diverse datasets: Pereira (subjects reading sentences from paragraphs) and Narratives (subjects listening to the spoken stories). Encoding models based on task features are used to predict activity in different regions across the whole brain. Features from coreference resolution, NER, and shallow syntax parsing explain greater variance for the reading activity. On the other hand, for the listening activity, tasks such as paraphrase generation, summarization, and natural language inference show better encoding performance. Experiments across all 10 task representations provide the following cognitive insights: (i) language left hemisphere has higher predictive brain activity versus language right hemisphere, (ii) posterior medial cortex, temporo-parieto-occipital junction, dorsal frontal lobe have higher correlation versus early auditory and auditory association cortex, (iii) syntactic and semantic tasks display a good predictive performance across brain regions for reading and listening stimuli resp.
Computation and Language, Artificial Intelligence, Machine Learning, Neurons and Cognition
What problem does this paper attempt to address?
This paper asks which natural language processing tasks are most predictive of fMRI brain activity. Specifically, the researchers use representations learned for ten popular NLP tasks (two syntactic and eight semantic) to predict brain responses from two datasets: the Pereira dataset (subjects read sentences) and the Narratives dataset (subjects listen to stories). By using encoding models built on task features to predict activity in different brain regions, they aim to reveal which NLP tasks best predict brain responses during reading and listening.

### Main Contributions

1. **Task Selection and Evaluation**: Given Transformer models fine-tuned for various NLP tasks, the researchers ask which tasks are most predictive of fMRI brain activity during reading and story listening.
2. **Task Performance**:
   - **Reading Task**: Coreference Resolution (CR), Named Entity Recognition (NER), and Shallow Syntax Parsing (SS) have higher predictive performance when subjects read text.
   - **Listening-to-Stories Task**: Paraphrase Detection (PD), Summarization (Sum), and Natural Language Inference (NLI) show better correlations when subjects listen to stories.

### Experimental Methods

- **Datasets**: The study used the Pereira dataset (subjects read sentences) and the Narratives-Pieman dataset (subjects listen to stories).
- **Model**: A ridge regression model was used to predict brain responses, with 2V2 accuracy and the Pearson correlation coefficient as evaluation metrics.
- **Feature Space**: Latent-space features were extracted for ten NLP tasks: coreference resolution, named entity recognition, natural language inference, paraphrase detection, question answering, sentiment analysis, semantic role labeling, shallow syntax parsing, summarization, and word-sense disambiguation.

### Results

- **Reading Task**: CR, NER, SRL, and SS show higher correlations when predicting brain responses. In particular, the left-hemisphere language region (Language_LH) has higher encoding performance than the right-hemisphere language region (Language_RH).
- **Listening-to-Stories Task**: PD, Sum, and NLI show higher correlations when predicting brain responses. In particular, the bilateral posterior medial cortex (PMC), which is associated with high-level language functions, shows the highest correlation.

### Cognitive Insights

- **Left-Right Hemisphere Differences**: The left hemisphere has higher predictive performance in language processing, consistent with the known left-hemisphere dominance for language.
- **Brain Region Differences**: The posterior medial cortex, temporo-parieto-occipital junction, and dorsal frontal lobe show higher correlations when predicting brain responses.

### Conclusions

NLP tasks differ significantly in how well they predict brain activity during reading and during story listening. Shallow tasks such as NER and SS are more effective for reading, while complex tasks such as PD, Sum, and NLI are more effective for story listening. These findings help in understanding how the brain processes different types of language stimuli.
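
The evaluation pipeline named under Experimental Methods (ridge regression scored with Pearson correlation and 2V2 accuracy) can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' implementation: the data are synthetic stand-ins for task features and voxel responses, the ridge penalty `alpha` is fixed instead of tuned, no intercept is fitted, and 2V2 accuracy follows the common pairwise definition based on cosine distance. The function names are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge weights W = (X^T X + alpha*I)^-1 X^T Y (no intercept)."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)

def pearson_per_voxel(Y_true, Y_pred):
    """Pearson correlation between true and predicted responses, one value per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom

def cosine_distance(a, b):
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def two_v_two_accuracy(Y_true, Y_pred):
    """Fraction of sample pairs (i, j) where matched predictions are closer
    to the true responses than swapped predictions (assumed definition)."""
    n = Y_true.shape[0]
    correct, total = 0, 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            matched = (cosine_distance(Y_true[i], Y_pred[i])
                       + cosine_distance(Y_true[j], Y_pred[j]))
            swapped = (cosine_distance(Y_true[i], Y_pred[j])
                       + cosine_distance(Y_true[j], Y_pred[i]))
            correct += int(matched < swapped)
            total += 1
    return correct / total

def make_synthetic(seed=0, n_samples=200, n_features=16, n_voxels=30):
    """Synthetic stand-in: X plays the role of task features, Y of voxel responses."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_features))
    W = rng.standard_normal((n_features, n_voxels))
    Y = X @ W + 0.1 * rng.standard_normal((n_samples, n_voxels))
    return X, Y

X, Y = make_synthetic()
X_train, X_test = X[:160], X[160:]
Y_train, Y_test = Y[:160], Y[160:]

W_hat = fit_ridge(X_train, Y_train, alpha=1.0)
Y_pred = X_test @ W_hat

mean_r = float(pearson_per_voxel(Y_test, Y_pred).mean())
acc = two_v_two_accuracy(Y_test, Y_pred)
print(f"mean Pearson r = {mean_r:.3f}, 2V2 accuracy = {acc:.3f}")
```

In a real analysis, one such model would be fitted per task representation (and typically per subject and brain region), with `alpha` chosen by cross-validation, so that the tasks can be compared on the same held-out stimuli.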