Abstract: Although word predictability is commonly considered an important factor in reading, sophisticated accounts of predictability in theories of reading are lacking. Computational models of reading traditionally use cloze norming as a proxy for word predictability, but precisely what cloze norms capture remains unclear. This study investigates whether large language models (LLMs) can fill this gap. Contextual predictions are implemented via a novel parallel-graded mechanism, where all predicted words at a given position are pre-activated as a function of contextual certainty, which varies dynamically as text processing unfolds. Through reading simulations with OB1-reader, a cognitive model of word recognition and eye-movement control in reading, we compare the model's fit to eye-movement data when using predictability values derived from a cloze task against those derived from LLMs (GPT-2 and LLaMA). Root mean square error (RMSE) between simulated and human eye movements indicates that LLM predictability provides a better fit than cloze. This is the first study to use LLMs to augment a cognitive model of reading with higher-order language processing while proposing a mechanism for the interplay between word predictability and eye movements.

Reading comprehension is a crucial skill that is highly predictive of later success in education. One aspect of efficient reading is our ability to predict what is coming next in the text based on the current context. Although we know predictions take place during reading, the mechanism through which contextual facilitation affects oculomotor behaviour in reading is not yet well understood. Here, we model this mechanism and test different measures of predictability (computational vs. empirical) by simulating eye movements with a cognitive model of reading. Our results suggest that, when implemented with our novel mechanism, a computational measure of predictability provides better fits to eye movements in reading than a traditional empirical measure. With this model, we scrutinize how predictions about upcoming input affect eye movements in reading, and how computational approaches to measuring predictability may support theory testing. Modelling aspects of reading comprehension and testing them against human behaviour contributes to the effort of advancing theory building in reading research. In the longer term, a better understanding of reading comprehension may help improve reading pedagogies, diagnoses and treatments.
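
As a rough illustration of the two ideas named in the abstract, the sketch below shows a graded pre-activation of all predicted words and an RMSE comparison between simulated and human measures. It is a minimal sketch under stated assumptions, not the OB1-reader implementation: the function names `preactivation` and `rmse`, the `max_boost` parameter, and the choice of one minus normalized entropy as the measure of contextual certainty are all hypothetical and do not come from the paper.

```python
import math


def preactivation(predictions, max_boost=1.0):
    """Pre-activate every predicted word in parallel, graded by certainty.

    predictions: dict mapping word -> predictability (from cloze or an LLM).
    ASSUMPTION: certainty is 1 minus the normalized entropy of the
    predictions; the abstract only says pre-activation is "a function of
    contextual certainty" and does not give a formula.
    """
    total = sum(predictions.values())
    if not predictions or total <= 0:
        return {}
    probs = {w: p / total for w, p in predictions.items()}
    if len(probs) == 1:
        certainty = 1.0  # a single candidate means full certainty
    else:
        entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
        certainty = max(0.0, 1.0 - entropy / math.log(len(probs)))
    # Each candidate receives a share of the boost proportional to its probability.
    return {w: max_boost * certainty * p for w, p in probs.items()}


def rmse(simulated, observed):
    """Root mean square error between simulated and human measures."""
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

In this sketch, a context with one dominant continuation yields strong pre-activation for that word, while a context with many equally likely continuations yields almost none. A lower `rmse` for one predictability source than another would correspond to the better fit reported in the abstract.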

Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling

Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences

Transformer-Based Language Model Surprisal Predicts Human Reading Times Best with About Two Billion Training Tokens

On the Predictive Power of Neural Language Models for Human Real-Time Comprehension Behavior

Temperature-scaling surprisal estimates improve fit to human reading times -- but does it do so for the "right reasons"?

Testing the Predictions of Surprisal Theory in 11 Languages

Language Model Evaluation Beyond Perplexity

Language models outperform cloze predictability in a cognitive model of reading

Syntactic Surprisal From Neural Models Predicts, But Underestimates, Human Processing Difficulty From Syntactic Ambiguities

Optimizing Predictive Metrics for Human Reading Behavior

Assessing Language Models with Scaling Properties

Limits to Predicting Online Speech Using Large Language Models

A Targeted Assessment of Incremental Processing in Neural Language Models and Humans

Navigating Brain Language Representations: A Comparative Analysis of Neural Language Models and Psychologically Plausible Models

On the Role of Context in Reading Time Prediction

Reverse-Engineering the Reader

Generalized Measures of Anticipation and Responsivity in Online Language Processing

The Effect of Surprisal on Reading Times in Information Seeking and Repeated Reading

Evaluating Computational Language Models with Scaling Properties of Natural Language

Psychometric Predictive Power of Large Language Models