Abstract: Although word predictability is commonly considered an important factor in reading, sophisticated accounts of predictability in theories of reading are lacking. Computational models of reading traditionally use cloze norming as a proxy for word predictability, but what cloze norms precisely capture remains unclear. This study investigates whether large language models (LLMs) can fill this gap. Contextual predictions are implemented via a novel parallel-graded mechanism, where all predicted words at a given position are pre-activated as a function of contextual certainty, which varies dynamically as text processing unfolds. Through reading simulations with OB1-reader, a cognitive model of word recognition and eye-movement control in reading, we compare the model's fit to eye-movement data when using predictability values derived from a cloze task against those derived from LLMs (GPT-2 and LLaMA). Root Mean Square Error between simulated and human eye movements indicates that LLM predictability provides a better fit than cloze. This is the first study to use LLMs to augment a cognitive model of reading with higher-order language processing while proposing a mechanism for the interplay between word predictability and eye movements.

Reading comprehension is a crucial skill that is highly predictive of later success in education. One aspect of efficient reading is our ability to predict what is coming next in the text based on the current context. Although we know predictions take place during reading, the mechanism through which contextual facilitation affects oculomotor behaviour in reading is not yet well understood. Here, we model this mechanism and test different measures of predictability (computational vs. empirical) by simulating eye movements with a cognitive model of reading. Our results suggest that, when implemented with our novel mechanism, a computational measure of predictability provides better fits to eye movements in reading than a traditional empirical measure. With this model, we scrutinize how predictions about upcoming input affect eye movements in reading, and how computational approaches to measuring predictability may support theory testing. Modelling aspects of reading comprehension and testing them against human behaviour contributes to the effort of advancing theory building in reading research. In the longer term, a better understanding of reading comprehension may help improve reading pedagogies, diagnoses and treatments.
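To make the two quantitative ideas in the abstract concrete, the following is a minimal sketch, not the OB1-reader implementation. It assumes that "contextual certainty" can be approximated as one minus the normalized entropy of the predicted-word distribution, and that each predicted word's pre-activation is its probability scaled by that certainty. The function names (`contextual_certainty`, `pre_activation`, `rmse`) and the `weight` parameter are hypothetical, introduced only for illustration.

```python
import math

def contextual_certainty(probs):
    """Assumed definition: 1 minus the normalized Shannon entropy of the predictions."""
    total = sum(probs.values())
    norm = [p / total for p in probs.values() if p > 0]
    if len(norm) < 2:
        return 1.0  # a single candidate means no uncertainty
    entropy = -sum(p * math.log(p) for p in norm)
    return 1.0 - entropy / math.log(len(norm))

def pre_activation(probs, weight=1.0):
    """Pre-activate all predicted words in parallel, graded by probability and certainty."""
    total = sum(probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    certainty = contextual_certainty(probs)
    return {w: weight * certainty * p / total for w, p in probs.items()}

def rmse(simulated, observed):
    """Root Mean Square Error between simulated and observed eye-movement measures."""
    if not simulated or len(simulated) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

Under these assumptions, a peaked distribution such as `{"cat": 0.9, "dog": 0.1}` pre-activates "cat" more strongly than "dog", while a uniform distribution yields no pre-activation because certainty is zero. A lower `rmse` between simulated and human measures (for example, fixation durations) corresponds to a better fit.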

Language Models Outperform Cloze Predictability in a Cognitive Model of Reading
